Explaining Heterogeneity - Cochrane


Explaining heterogeneity
Julian Higgins
School of Social and Community Medicine, University of Bristol, UK

Outline
1. Revision and remarks on fixed-effect and random-effects meta-analysis methods (and interpretation under heterogeneity)
Explaining heterogeneity:
2. Subgroup analysis
3. Meta-regression
4. Problems
5. Closing remarks

Example 1: Trials of exercise for treatment of depression

Part 1: Revision and remarks on fixed-effect and random-effects meta-analysis methods

Revision (1)
- Fixed-effect(s) meta-analysis
- Weights $w_i = 1/v_i$ (minimise variability of the summary estimate)
[Forest plot: estimates and 95% confidence intervals by study; summary SMD -1.01 (-1.24 to -0.79). Axis: standardized mean difference (favours exercise / favours control)]
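The inverse-variance weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the slides: the ten SMDs and confidence limits are copied from the metan output shown later in this deck, and the variances are back-calculated from those rounded limits on the assumption that each interval is the estimate plus or minus 1.96 standard errors. The names `TRIALS`, `variance_from_ci` and `fixed_effect` are made up for this sketch.

```python
import math

# (SMD, lower 95% limit, upper 95% limit) for the 10 exercise trials,
# copied from the metan output reproduced later in these slides.
TRIALS = [
    (-2.10, -2.88, -1.32), (-1.16, -1.71, -0.61), (0.25, -0.75, 1.25),
    (-0.45, -1.12, 0.22), (-0.53, -1.00, -0.06), (-1.20, -2.04, -0.36),
    (-0.82, -1.94, 0.30), (-0.84, -1.74, 0.06), (-1.07, -1.87, -0.27),
    (-2.53, -3.31, -1.75),
]


def variance_from_ci(lower, upper):
    """Back-calculate v_i, assuming the CI is estimate +/- 1.96 * SE."""
    se = (upper - lower) / (2 * 1.96)
    return se ** 2


def fixed_effect(estimates, variances):
    """Inverse-variance weighted average with its 95% confidence interval."""
    weights = [1.0 / v for v in variances]           # w_i = 1 / v_i
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)                      # Var(pooled) = 1 / sum(w_i)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


y = [t[0] for t in TRIALS]
v = [variance_from_ci(t[1], t[2]) for t in TRIALS]
pooled, low, high = fixed_effect(y, v)
print(f"Fixed-effect SMD: {pooled:.2f} ({low:.2f} to {high:.2f})")
```

Because the inputs are rounded, the result should only be expected to agree with the slide's summary estimate to about two decimal places.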

Revision (2)
- Examine whether effects vary across studies
- Q = 35.4 (9 d.f.), p = 0.00005
- $\tau_{DL}$ = 0.64 (for example)
- $I^2$ = 75%
[Forest plot: estimates and 95% confidence intervals by study. Axis: standardized mean difference (favours exercise / favours control)]
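As a quick check on how the inconsistency measure relates to Q, the usual definition $I^2 = (Q - \mathrm{df})/Q$ (truncated at zero) can be applied to the figures quoted above. The function name `i_squared` is just for this sketch.

```python
def i_squared(q, df):
    """I^2 as a percentage: the share of Q in excess of its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Q and degrees of freedom as quoted on the slide.
i2 = i_squared(35.4, 9)
print(f"I-squared: {i2:.1f}%")
```

This gives a value that rounds to the 75% shown on the slide.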

Revision (3)
- Meta-analysis under random-effects assumption: effect size varies across studies
- Between-study variance $\tau^2$
- Weights $w_i = 1/(v_i + \hat{\tau}^2)$
[Forest plot: estimates and 95% confidence intervals by study; summary SMD -1.06 (-1.53 to -0.59). Axis: standardized mean difference (favours exercise / favours control)]
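A minimal sketch of a random-effects analysis follows, assuming the DerSimonian-Laird (method of moments) estimator is what the subscript in $\tau_{DL}$ refers to. The data are the ten SMDs and confidence limits from the metan output later in this deck, with variances back-calculated from the rounded limits (an assumption), so agreement with the slide figures is approximate. Function and variable names are illustrative only.

```python
import math

# (SMD, lower 95% limit, upper 95% limit), from the metan output in these slides.
TRIALS = [
    (-2.10, -2.88, -1.32), (-1.16, -1.71, -0.61), (0.25, -0.75, 1.25),
    (-0.45, -1.12, 0.22), (-0.53, -1.00, -0.06), (-1.20, -2.04, -0.36),
    (-0.82, -1.94, 0.30), (-0.84, -1.74, 0.06), (-1.07, -1.87, -0.27),
    (-2.53, -3.31, -1.75),
]


def dersimonian_laird(estimates, variances):
    """Cochran's Q, DerSimonian-Laird tau^2 and the random-effects summary."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                    # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]   # w_i* = 1 / (v_i + tau^2)
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    return {"q": q, "df": df, "tau2": tau2, "pooled": pooled,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se)}


y = [t[0] for t in TRIALS]
v = [((t[2] - t[1]) / (2 * 1.96)) ** 2 for t in TRIALS]  # assumes CI = est +/- 1.96 SE
result = dersimonian_laird(y, v)
print(f"Q = {result['q']:.1f} on {result['df']} d.f., tau = {math.sqrt(result['tau2']):.2f}")
print(f"Random-effects SMD: {result['pooled']:.2f} "
      f"({result['ci'][0]:.2f} to {result['ci'][1]:.2f})")
```

The wider interval compared with the fixed-effect analysis reflects the extra $\hat{\tau}^2$ term in every weight.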

Three simple models for relationships of effect size parameters across studies
- Common-effect model (or fixed-effect model): the underlying effect sizes are identical
- Random-effects model: the underlying effect sizes are different and related through some distribution
- Fixed-effects model: the underlying effect sizes are different and unrelated (independent)

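Two simple methods for combining effect size estimates across studies

Fixed-effect method (common-effect model and fixed-effects model):
- $\hat{\theta} = \dfrac{\sum_i w_i y_i}{\sum_i w_i}$, with weights $w_i = \dfrac{1}{s_i^2}$
- Variance: $\dfrac{1}{\sum_i w_i}$
- Estimate of: $\theta$ under the common-effect model; $\dfrac{\sum_i w_i \theta_i}{\sum_i w_i}$ under the fixed-effects model

Random-effects method (random-effects model):
- $\hat{\mu} = \dfrac{\sum_i w_i^* y_i}{\sum_i w_i^*}$, with weights $w_i^* = \dfrac{1}{s_i^2 + \hat{\tau}^2}$
- Variance: $\dfrac{1}{\sum_i w_i^*}$
- Estimate of: $\mu$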

Three simple models for relationships of effect size parameters across studies
- Common-effect model (or fixed-effect model)
- Fixed-effects model
Complication is that the fixed-effect meta-analysis method can be interpreted under either of these models.

Aside: an interpretation of the weighted average under fixed-effects model

$\dfrac{\sum_i p_i r_i \theta_i}{\sum_i p_i r_i}$

- A weighted average of the study-specific effects seen in the individual study populations, where the weight for study population i is proportional to both the number of participants selected ($p_i$) and to how much information is contributed per subject ($r_i$)
- No assumption of homogeneity is required
- But heterogeneity is ignored in the analysis, so inference on it is required
  - this inference is statistically independent
- The empirical weights are poor estimates of $p_i r_i$ unless the studies are large
Rice, Higgins and Lumley (submitted)

What is heterogeneity?
- Variation beyond that expected by chance alone
- Observed result = true effect + bias + random error
- True effects vary due to 'clinical diversity'
  - variants in PICO
- Biases vary due to 'methodological diversity'
  - design, conduct, problems
- In practice, true effects and biases are inseparable, so group them together
- Variation in (true effect + bias) is measured by the between-study variance $\tau^2$

Exploring heterogeneity
In which types of trials does the intervention work best?
- Characteristics of studies may be associated with the size of treatment effect
  - Does the intervention work better if given for longer?
  - Are smaller odds ratios observed in high-risk populations?
  - Is there a relationship between sample size and effect size (e.g. due to publication bias)?
  - Is inadequate allocation concealment associated with a larger effect estimate?
  - Is A better than B, when they've each only been compared with C?
- For discrete characteristics, can use subgroup analyses
- We can use subgroup analyses or meta-regression to answer questions like these

Subgroup analysis
- Divide up the studies, e.g. by duration of trial
- Test for subgroup differences: can apply Q test to subgroup results; here, P = 0.28

Subgroup               Studies                                          SMD (95% CI)
>8 weeks follow-up     Klein, Martinsen, Reuter, Singh, Veale           -0.82 (-1.46 to -0.19)
4-8 weeks follow-up    Doyne, Epstein, Hess-Homeier, Mutrie, McNeil     -1.33 (-1.99 to -0.67)

[Forest plot: estimates and 95% confidence intervals by subgroup. Axis: standardized mean difference (favours exercise / favours control)]

Meta-regression
- Examine heterogeneity
- Predict effect according to length of follow-up
- Difference in SMDs = 0.18 (95% CI 0.02 to 0.34): i.e. the size of the SMD decreases by about 0.2 for each extra week of follow-up
[Forest plot: estimates and 95% confidence intervals, with studies ordered by length of follow-up in weeks. Axis: standardized mean difference (favours exercise / favours control)]

Part 2: Subgroup analysis

Subgroup analyses
- Split data into subgroups:
  - Subsets of patients
  - Subsets of studies
- Do separate meta-analyses for each
- Subgroup analyses are done for three reasons:
  - To answer multiple questions (one for each subset of studies)
  - To examine whether the effect is different in different subgroups (comparing subgroups)
  - To assess whether restricting analysis to one subgroup changes the conclusion (sensitivity analysis)

Subgroup analyses
Behavioural interventions for smokeless tobacco use cessation
Abstinence from all tobacco use (where reported) at 6 months or more

Subgroup analyses

Comparing two subgroups
- Simple statistics: take the difference in point estimates and add their variances
  - all on the log scale for ratio measures

Comparison of subgroups (the rows for the individual subgroups are not legible in this transcript):

                        OR       OR CI low   OR CI upp   lnOR      lnOR CI low   lnOR CI upp   lnOR var
DIFF / SUM                                               -0.258    -0.510        -0.006        0.016
RATIO                   0.773    0.601       0.994
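The arithmetic behind this comparison can be sketched as follows. The two functions are illustrative names, not part of any package; the only numbers used are the difference in log odds ratios (-0.258) and the summed variance (0.016) that survive on the slide, and a 1.96 multiplier for the 95% interval is assumed.

```python
import math


def compare_log_estimates(est_1, var_1, est_2, var_2):
    """Difference of two independent log-scale estimates and its variance."""
    return est_1 - est_2, var_1 + var_2


def ratio_with_ci(diff, var_sum, z=1.96):
    """Back-transform a log-scale difference to a ratio with a 95% CI."""
    se = math.sqrt(var_sum)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


# DIFF and SUM as shown on the slide.
ratio, low, high = ratio_with_ci(-0.258, 0.016)
print(f"Ratio of odds ratios: {ratio:.3f} ({low:.3f} to {high:.3f})")
```

Because the slide's variance is rounded to three decimals, the limits agree with the slide only to about two decimal places.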

Fixed-effects test for subgroup differences
- Simple method
  - $Q_{all}$: chi-squared statistic for all the studies
  - $Q_1, \ldots, Q_m$: chi-squared statistics for m different subgroups
- Heterogeneity explained by differences between subgroups ($Q_{bet}$):
  $Q_{bet} = Q_{all} - (Q_1 + \cdots + Q_m)$
- Has degrees of freedom df = m - 1

Fixed-effects test for subgroup differences
[Figure: forest plot by subgroup, annotated with Q1, Q2 and Qall]

Fixed-effects test for subgroup differences
- $Q_{bet} = 34.42 - (24.38 + 6.00) = 4.04$
- Two subgroups, so 2 - 1 = 1 degree of freedom
- Chi-squared value of 4.04 with 1 degree of freedom: P = 0.044
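The worked example above can be reproduced with the standard library alone. In this sketch `chi2_sf` is a hand-written upper-tail probability for a chi-squared variable with integer degrees of freedom (built from `math.erfc` and the usual recurrence for the incomplete gamma function), and `q_between` is an illustrative name; neither comes from the slides.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, start = 0.0, 0.0                      # even df: pure series
    else:
        a, start = 0.5, math.erfc(math.sqrt(h))  # odd df: start from df = 1
    terms = sum(h ** (a + k) / math.gamma(a + k + 1) for k in range(df // 2))
    return start + math.exp(-h) * terms


def q_between(q_all, q_subgroups):
    """Q_bet = Q_all - sum of within-subgroup Q, with df = m - 1."""
    q_bet = q_all - sum(q_subgroups)
    df = len(q_subgroups) - 1
    return q_bet, df, chi2_sf(q_bet, df)


# Values from the slide: Q_all = 34.42, Q_1 = 24.38, Q_2 = 6.00.
q_bet, df, p = q_between(34.42, [24.38, 6.00])
print(f"Q_bet = {q_bet:.2f} on {df} d.f., P = {p:.3f}")
```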

Test for subgroup differences

General test for subgroup differences
- Just do a test for heterogeneity across the results of all the subgroups
- Fine for any number of subgroups, and any type of analysis within the subgroups
[Figure: the subgroup summary estimates are treated as "Study 1" and "Study 2"]
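A sketch of this general test, applied to the two follow-up subgroups of the exercise example: each subgroup summary is treated as if it were a single study and a heterogeneity Q statistic is computed across them. The subgroup estimates are taken from the subgroup analysis slide; the standard errors are back-calculated from the rounded 95% limits (an assumption), and the helper names are illustrative.

```python
import math

# Subgroup summaries (SMD, lower, upper) from the subgroup analysis slide.
SUBGROUPS = {
    "long follow-up": (-0.82, -1.46, -0.19),
    "short follow-up (4-8 weeks)": (-1.33, -1.99, -0.67),
}


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    a, start = (0.0, 0.0) if df % 2 == 0 else (0.5, math.erfc(math.sqrt(h)))
    terms = sum(h ** (a + k) / math.gamma(a + k + 1) for k in range(df // 2))
    return start + math.exp(-h) * terms


def heterogeneity_test(estimates, variances):
    """Cochran's Q across a set of estimates, with its P value."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    return q, df, chi2_sf(q, df)


est = [s[0] for s in SUBGROUPS.values()]
var = [((s[2] - s[1]) / (2 * 1.96)) ** 2 for s in SUBGROUPS.values()]
q, df, p = heterogeneity_test(est, var)
print(f"Q = {q:.2f} on {df} d.f., P = {p:.2f}")
```

The same function works unchanged for three or more subgroups, which is the point of the general approach.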

Part 3: Meta-regression

Meta-regression
- A general framework for looking at possible explanations for differences in effect sizes across studies
- A test for subgroup differences can be done using meta-regression
- But meta-regression is also good for continuous characteristics
- Linear regression
  - outcome variable = effect size (e.g. SMD or log RR)
  - explanatory variable = summary description of study (e.g. duration or location)

Simple linear regression
[Figure: scatter plot of SMD against trial duration, with best fit line]

Why we don't use simple linear regression
- Just as in meta-analysis, the studies are different sizes, and should have different influences on the analysis
- There's a difference between a big study and a small study
[Figure: two scatter plots of SMD against trial duration, contrasting a big study with a small study]

Has effectiveness of fluoride gels changed over time?
'Bubble plot': circles proportional to inverse-variance (weight in a meta-analysis)
[Figure: prevented fraction against year, 1960 to 2000]
Marinho et al (2004)

Number of sessions of CBT
[Figure: SMD against number of sessions]

Fixed-effect meta-analysis
Effect estimate = common treatment effect + random error
[Diagram; axis: treatment effect]

Fixed-effect meta-regression
Treatment effect = intercept + slope × x
Effect estimate = treatment effect + random error
[Diagram; axes: treatment effect and explanatory variable, x]

The other way round (the convention for plotting)
[Diagram; axes: treatment effect against explanatory variable, x]

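Random-effects meta-analysis
Effect estimate = treatment effect + random error, with treatment effects distributed around the mean treatment effect with variance $\tau^2_{tot}$
[Diagram; axis: treatment effect]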

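Random-effects meta-regression
Mean treatment effect = intercept + slope × x
Effect estimate = treatment effect + random error, with treatment effects distributed around the mean treatment effect with variance $\tau^2_{reg}$
[Diagram; axes: treatment effect and explanatory variable, x]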

Proportion of heterogeneity explained
- Compare the heterogeneity variance from random-effects meta-analysis ($\tau^2_{tot}$) with the heterogeneity variance from random-effects meta-regression ($\tau^2_{reg}$):

  % variance explained $= \dfrac{\tau^2_{tot} - \tau^2_{reg}}{\tau^2_{tot}} \times 100\%$

- A useful measure of the explanatory ability of a (set of) covariate(s)
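The formula is simple enough to check directly. The sketch below applies it to the two heterogeneity variances quoted for the BCG example in these slides (0.313 from the meta-analysis and 0.076 from the meta-regression on latitude); the function name is illustrative.

```python
def percent_variance_explained(tau2_total, tau2_regression):
    """100 * (tau2_tot - tau2_reg) / tau2_tot; can be negative in practice."""
    if tau2_total <= 0:
        raise ValueError("tau2_total must be positive")
    return 100.0 * (tau2_total - tau2_regression) / tau2_total


# Heterogeneity variances quoted for the BCG example in these slides.
explained = percent_variance_explained(0.313, 0.076)
print(f"{explained:.0f}% of between-study variance explained")
```

Since both variances are estimates, the ratio is itself imprecise when there are few studies.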

Example - BCG vaccination
It has been recognised for many years that the protection given by BCG against tuberculosis varied between trials.

Risk ratios:
- MRC trial (UK): 0.24 (95% CI 0.18 to 0.31)
- Madras (South India): 1.01 (95% CI 0.89 to 1.14)

Random-effects meta-analysis (Colditz et al. 1994)
Summary (random effects) RR: 0.49 (95% CI 0.34 to 0.70)
"the results of this meta-analysis lend added weight and confidence to arguments favouring the use of BCG vaccine"

BCG vaccination to prevent tuberculosis

Study ID    RR (95% CI)
1           0.41 (0.13, 1.26)
2           0.20 (0.09, 0.49)
3           0.26 (0.07, 0.92)
4           0.24 (0.18, 0.31)
5           0.80 (0.52, 1.25)
6           0.46 (0.39, 0.54)
7           0.20 (0.08, 0.50)
8           1.01 (0.89, 1.14)
9           0.63 (0.39, 1.00)
10          0.25 (0.15, 0.43)
11          0.71 (0.57, 0.89)
12          1.56 (0.37, 6.53)
13          0.98 (0.58, 1.66)
Overall (I-squared = 92.1%, p = 0.000)    0.49 (0.34, 0.70)
with estimated predictive interval        (0.14, 1.77)

NOTE: Weights are from random effects analysis.
[Forest plot. Axis: risk ratio (favours vaccination / favours no vaccination)]

BCG vaccination to prevent tuberculosis

Study ID    RR (95% CI)          Absolute latitude
1           1.01 (0.89, 1.14)    13
2           0.80 (0.52, 1.25)    13
3           0.71 (0.57, 0.89)    18
4           0.20 (0.08, 0.50)    19
5           0.63 (0.39, 1.00)    27
6           0.98 (0.58, 1.66)    33
7           1.56 (0.37, 6.53)    33
8           0.25 (0.15, 0.43)    42
9           0.26 (0.07, 0.92)    42
10          0.46 (0.39, 0.54)    44
11          0.41 (0.13, 1.26)    44
12          0.24 (0.18, 0.31)    52
13          0.20 (0.09, 0.49)    55
Overall (I-squared = 92.1%, p = 0.000)    0.49 (0.34, 0.70)
with estimated predictive interval        (0.14, 1.77)

NOTE: Weights are from random effects analysis.
[Forest plot ordered by latitude. Axis: risk ratio (favours vaccination / favours no vaccination)]
Berkey et al, 1995

Meta-regression: Dependence of BCG vaccine efficacy on study latitude
- $\tau^2$ (meta-analysis) = 0.313
- $\tau^2$ (meta-regression) = 0.076
- so 76% of variance is explained
[Figure: risk ratio against latitude, 10 to 60]
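A minimal sketch of a one-covariate meta-regression on these data is given below. It is weighted least squares with weights $1/(v_i + \tau^2)$: setting $\tau^2 = 0$ gives a fixed-effect meta-regression, and plugging in the slide's residual heterogeneity variance (0.076) gives a random-effects version. Two simplifications are assumptions of this sketch rather than the method behind the slides: the variances of the log risk ratios are back-calculated from the rounded confidence limits, and $\tau^2$ is supplied rather than estimated. All names are illustrative.

```python
import math

# (risk ratio, lower 95% limit, upper 95% limit, absolute latitude),
# copied from the latitude-ordered table in these slides.
BCG = [
    (1.01, 0.89, 1.14, 13), (0.80, 0.52, 1.25, 13), (0.71, 0.57, 0.89, 18),
    (0.20, 0.08, 0.50, 19), (0.63, 0.39, 1.00, 27), (0.98, 0.58, 1.66, 33),
    (1.56, 0.37, 6.53, 33), (0.25, 0.15, 0.43, 42), (0.26, 0.07, 0.92, 42),
    (0.46, 0.39, 0.54, 44), (0.41, 0.13, 1.26, 44), (0.24, 0.18, 0.31, 52),
    (0.20, 0.09, 0.49, 55),
]


def meta_regression(y, v, x, tau2=0.0):
    """Weighted least squares of y on x with weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)   # model-based SE, no Knapp-Hartung adjustment
    return intercept, slope, se_slope


log_rr = [math.log(rr) for rr, _, _, _ in BCG]
var = [((math.log(hi) - math.log(lo)) / (2 * 1.96)) ** 2 for _, lo, hi, _ in BCG]
latitude = [lat for _, _, _, lat in BCG]

_, slope_fe, se_fe = meta_regression(log_rr, var, latitude)             # fixed-effect
_, slope_re, se_re = meta_regression(log_rr, var, latitude, tau2=0.076)  # random-effects
print(f"Fixed-effect slope:   {slope_fe:.4f} (SE {se_fe:.4f})")
print(f"Random-effects slope: {slope_re:.4f} (SE {se_re:.4f})")
```

Both slopes are on the log risk ratio scale per degree of latitude; the random-effects standard error is larger because every weight is reduced by the added $\tau^2$.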

Hartung-Knapp
- Simulation study:
  "Of the different methods evaluated, only the Knapp and Hartung method and the permutation test provide adequate control of the Type I error rate across all conditions. Due to its computational simplicity, the Knapp and Hartung method is recommended as a suitable option for most meta-analyses."
  Viechtbauer et al 2015
- More on permutation tests later

Variance explained
- Within studies: 8%
- Between studies ($I^2$): 92%
  - Unexplained: 24%
  - Explained by latitude ($R^2$): 76%

Subgroup analysis vs meta-regression
- The key difference is the assumption made about heterogeneity in random-effects analyses
  - estimated separately for each subgroup in subgroup analyses (as usually implemented)
  - assumed equal across subgroups in meta-regression (as usually implemented)
- Of course, there is flexibility to do things in different ways
  - some software assumes equal heterogeneity when performing subgroup analyses

. metan smd sesmd, random by(duration) label(namevar study)

    Study ID                                   ES (95% CI)              % Weight
    Long (>8 weeks)
      Reuter                                   -2.10 (-2.88, -1.32)     10.04
      Martinsen                                -1.16 (-1.71, -0.61)     11.72
      Klein                                     0.25 (-0.75, 1.25)       8.54
      Singh                                    -0.45 (-1.12, 0.22)      10.89
      Veale                                    -0.53 (-1.00, -0.06)     12.24
      Subtotal (I-squared = 78.7%, p = 0.001)  -0.82 (-1.46, -0.19)     53.44     τ² = 0.37
    Short (4-8 weeks)
      Doyne                                    -1.20 (-2.04, -0.36)      9.62
      Hess-Homeier                             -0.82 (-1.94, 0.30)       7.79
      Epstein                                  -0.84 (-1.74, 0.06)       9.21
      McNeil                                   -1.07 (-1.87, -0.27)      9.90
      Mutrie                                   -2.53 (-3.31, -1.75)     10.04
      Subtotal (I-squared = 64.8%, p = 0.023)  -1.33 (-1.99, -0.66)     46.56     τ² = 0.40
    Overall (I-squared = 74.6%, p = 0.000)     -1.06 (-1.53, -0.59)    100.00
    NOTE: Weights are from random effects analysis

Tests for subgroup differences
1. Q test from subgroup results: P = 0.28 (RevMan/metan)
2. Meta-regression: P = 0.34 (metareg)

. metareg smd duration, wsse(sesmd) mm

    Meta-regression                                      Number of studies = 10
    Method of moments estimate of between-study variance: tau2 = .3893

    smd         Coef.        Std. Err.    t       P>|t|    [95% Conf. Interval]
    duration    -.5026523    .4990484     -1.01   0.343    -1.65346    .6481553

Interpretation of meta-regression coefficients

                                    Continuous outcome (MD/SMD)                Dichotomous outcome (OR/RR)
Continuous explanatory variable     Difference in mean differences or in       Ratio of ORs or of RRs associated
                                    SMDs associated with 1 point increase      with 1 point increase in covariate
                                    in covariate
Dichotomous explanatory variable    Difference in mean differences or in       Ratio of ORs or of RRs between
                                    SMDs between 2 groups                      2 groups

Meta-regression with a dichotomous variable gives a formal comparison between subgroups.

Part 4: Problems

Problems with exploration of heterogeneity
- Observational relationships
- Confounding (correlation between characteristics)
  [Diagram: treatment effect and year of randomization, with quality as a confounder (associated with treatment effect and year)]
- Aggregation bias
- Low power with few studies: false-negative findings
- Multiple characteristics: false-positive findings
Thompson & Higgins (2002)

Aggregation bias
- Beware study characteristics that summarize participants within a study, e.g.
  - average age
  - % females
  - average duration of follow-up
  - % drop out

Relationship between treatment effect and average age
[Figure: SMD (log scale) against average age in each study, 30 to 70]


Aggregation bias (continued)
- Relationships across studies may not reflect relationships within studies
- The relationship between treatment effect and age, sex, etc. is best measured within a study
- Needs individual participant data

Lack of power
- Unfortunately most meta-analyses in Cochrane reviews do not have many studies
- Meta-regression typically has low power to detect relationships
- Model diagnostics / adequacy difficult to assess

Selecting explanatory variables
- There are typically many, many explanatory variables to choose from
- Heterogeneity can always be explained if you look at enough of them
- Great risk of spurious findings

False-positive findings
- To guard against false-positives, meta-analysts are advised to
  - Pre-specify characteristics in a protocol
  - Limit to a small number
  - Have a scientific rationale
- Beware 'prognostic factors'
  - Variables that predict clinical outcome don't necessarily affect treatment effects
  - e.g. age may be strongly prognostic, but risk ratios may well be the same irrespective of age

Risk of false-positive findings
- Depends on
  - number of studies
  - extent of heterogeneity
  - precision of effect estimates
[Figure: two panels showing risk ratios (0.5 to 2) for studies labelled 8 wks, 16 wks and 24 wks; label: Ave. age]

Simulation study of false-positive rates
Example results: 10 studies, typical study weights, 1 covariate
[Figure: significance level (0% to 25%) against extent of inconsistency ($I^2$, 0 to 83%), for fixed-effect and random-effects analyses]

Solution? A permutation test
[Figure: risk ratios for three studies of duration 8 wks, 16 wks and 25 wks; P = 0.0001]
- But 1/3 of possible permutations give a relationship at least as strong as this
- So 'real' p = 0.33

Illustration
[Table: studies 1 to 13, with the latitude values randomly re-allocated among the studies; the meta-regression test statistic is recalculated for each permutation, up to T(1000)]

Permutation test p-value
- Compare $T_{obs}$ with the values $T^{(1)}, \ldots, T^{(1000)}$
- Count the number of T's that exceed $T_{obs}$
- e.g. for the BCG example
  - $T_{obs}$ = 2.44 (= slope/SE); 54/1000 T's exceed $T_{obs}$: p = 0.054
  - c.f. FE: p 10^-10; RE (standard normal): p = 0.012
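The counting procedure can be sketched as below, using the BCG risk ratios and latitudes from the table earlier in these slides. This is a deliberately simplified illustration and is not expected to reproduce the figures quoted on the slide: the test statistic is |slope/SE| from a weighted regression with the heterogeneity variance held fixed at 0.076 (a fuller implementation would re-estimate it for every permutation), the variances are back-calculated from rounded confidence limits, and the seed and function names are arbitrary choices for this sketch.

```python
import math
import random

# (risk ratio, lower 95% limit, upper 95% limit, absolute latitude) from the slides.
BCG = [
    (1.01, 0.89, 1.14, 13), (0.80, 0.52, 1.25, 13), (0.71, 0.57, 0.89, 18),
    (0.20, 0.08, 0.50, 19), (0.63, 0.39, 1.00, 27), (0.98, 0.58, 1.66, 33),
    (1.56, 0.37, 6.53, 33), (0.25, 0.15, 0.43, 42), (0.26, 0.07, 0.92, 42),
    (0.46, 0.39, 0.54, 44), (0.41, 0.13, 1.26, 44), (0.24, 0.18, 0.31, 52),
    (0.20, 0.09, 0.49, 55),
]


def slope_statistic(y, v, x, tau2):
    """|slope / SE| from weighted least squares with weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    return abs(sxy / sxx) * math.sqrt(sxx)


def permutation_p_value(y, v, x, tau2, n_perm=1000, seed=1):
    """Share of random re-allocations of x giving a statistic >= the observed one."""
    rng = random.Random(seed)
    t_obs = slope_statistic(y, v, x, tau2)
    x_perm = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(x_perm)                  # re-allocate covariate values to studies
        if slope_statistic(y, v, x_perm, tau2) >= t_obs:
            exceed += 1
    return t_obs, exceed / n_perm


log_rr = [math.log(rr) for rr, _, _, _ in BCG]
var = [((math.log(hi) - math.log(lo)) / (2 * 1.96)) ** 2 for _, lo, hi, _ in BCG]
latitude = [lat for _, _, _, lat in BCG]

t_obs, p_perm = permutation_p_value(log_rr, var, latitude, tau2=0.076)
print(f"T_obs = {t_obs:.2f}, permutation p = {p_perm:.3f}")
```

The effect estimates and their variances stay attached to their studies; only the covariate is shuffled, so the permutation distribution reflects what a slope would look like if latitude were unrelated to effect size.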

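Multiple testing example: Exercise
[Figure: forest plot with 0/1 indicators for several study characteristics (one labelled PhD). Axis: standardized mean difference (favours exercise / favours control)]
Columns: RE meta-regression (STATA); permutation test for "nth" most significant
p-values in the order extracted: 0.0006, 0.01, 0.24, 0.25, 0.98, 0.11, 0.09, 0.25, 0.10, 0.88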

Part 5: Closing remarks
- Selecting explanatory variables
- Meta-regression in Cochrane reviews
- Extensions
- Summary

Selecting explanatory variables
- Specify a small number of characteristics in advance
- Ensure there is scientific rationale for investigating each characteristic
  - effect modifier or prognostic factor?
- Make sure the effect of a characteristic can be identified
  - does it differentiate studies?
  - aggregation bias
- Think about whether the characteristic is closely related to another characteristic
  - confounding

How many studies / characteristics?
- Typical guidance is to have 10 studies for each characteristic examined
- Some say 5 studies is enough

Including meta-regression in a Cochrane review
- Authors are encouraged to use meta-regression if appropriate
- 'Bubble' plots may be included as an 'Other figure'
- Results should be presented in an 'Additional table'
- Consider presenting something like:

Explanatory variable    Slope or Exp(slope)    95% confidence interval    Proportion of variation explained    Interpretation
Duration                OR = 1.3               0.9 to 1.8                 14%                                  Odds ratio increases with duration (not statistically significant)

Extensions
- Baseline risk of the studied population (as measured in the control group) might be considered as an explanatory variable
  - Beware! It is inherently correlated with treatment effects
  - Special methods are needed (see Thompson et al 1997)
- Precision of the estimate (e.g. its standard error) might be considered as an explanatory variable to look at small study effects (may be suggestive of publication bias)
  - Beware! It is often inherently correlated with treatment effects
  - Special methods are needed (see Sterne et al 2011)

Key messages
- Subgroup analysis and meta-regression examine the relationship between treatment effects and one or more study-level characteristics
- Meta-regression is like usual linear regression, but
  - study weights are incorporated
  - variation not explained by the characteristics should be accounted for: random-effects meta-regression does this; fixed-effect meta-regression does not

Key messages
- Subgroup analyses and meta-regression are great in theory, and easy to perform in Stata or R (and other software)
- They are fraught with dangers and should be undertaken and interpreted with caution
  - observational relationships
  - too few studies
  - too many sources of diversity and bias
  - confounding and aggregation bias

References
- Berkey CS, Hoaglin DC, Mosteller F, Colditz GA. A random-effects regression model for meta-analysis. Statistics in Medicine 1995; 14: 395-411
- Borenstein et al. Introduction to Meta-analysis, chapters 19, 20, 21
- Higgins JPT, Thompson SG. Controlling the risk of spurious results from meta-regression. Statistics in Medicine 2004; 23: 1663-1682
- Higgins J, Thompson S, Altman D, Deeks J. Statistical heterogeneity in systematic reviews of clinical trials: a critical appraisal of guidelines and practice. Journal of Health Services Research and Policy 2002; 7: 51-61
- Hughes JR, Stead LF, Lancaster T. Antidepressants for smoking cessation (Cochrane Review). In: The Cochrane Library, Issue 4, 2004. Chichester, UK: John Wiley & Sons, Ltd
- Marinho VCC, Higgins JPT, Logan S, Sheiham A. Fluoride gels for preventing dental caries in children and adolescents (Cochrane Review). In: The Cochrane Library, Issue 3, 2004. Chichester, UK: John Wiley & Sons, Ltd
- Sterne JAC, Sutton AJ, Ioannidis JPA, et al. Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ 2011; 343: d4002

- Thompson SG. Controversies in meta-analysis: the case of the trials of serum cholesterol reduction. Statistical Methods in Medical Research 1993; 2: 173-192
- Thompson SG. Why sources of heterogeneity in meta-analysis should be investigated. BMJ 1994; 309: 1351-1355
- Thompson SG, Higgins JPT. How should meta-regression analyses be undertaken and interpreted? Statistics in Medicine 2002; 21: 1559-1574
- Thompson SG, Sharp SJ. Explaining heterogeneity in meta-analysis: a comparison of methods. Statistics in Medicine 1999; 18: 2693-2708
- Thompson SG, Smith TC, Sharp SJ. Investigation underlying risk as a source of heterogeneity in meta-analysis. Statistics in Medicine 1997; 16: 2741-2758
- van Houwelingen HC, Arends LR, Stijnen T. Tutorial in Biostatistics: Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in Medicine 2002; 21: 589-624
- Viechtbauer W, López-López JA, Sánchez-Meca J, Marín-Martínez F. A comparison of procedures to test for moderators in mixed-effects meta-regression models. Psychological Methods 2015; 20: 360-374
