2014 NATIONAL SURVEY ON DRUG USE AND HEALTH


2014 NATIONAL SURVEY ON DRUG USE AND HEALTH
MENTAL HEALTH ESTIMATES COMPUTED DIRECTLY FROM THE CLINICAL SAMPLE OF THE MENTAL HEALTH SURVEILLANCE STUDY AND MEASURES OF THEIR STANDARD ERRORS

Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
Rockville, Maryland

April 2016


Contract No. HHSS283201300001C
RTI Project No. 0213757.004.107.008

RTI Authors: Phillip S. Kott, Dan Liao
RTI Project Director: David Hunter

SAMHSA Authors: Jonaki Bose, Sarra Hedden, Matthew Williams, Art Hughes
SAMHSA Project Officer: Peter Tice

For questions about this document, please e-mail Peter.Tice@samhsa.hhs.gov.

Prepared for Substance Abuse and Mental Health Services Administration, Rockville, Maryland
Prepared by RTI International, Research Triangle Park, North Carolina

April 2016

Recommended Citation: Center for Behavioral Health Statistics and Quality. (2016). 2014 National Survey on Drug Use and Health: Mental health estimates computed directly from the clinical sample of the Mental Health Surveillance Study and measures of their standard errors. Substance Abuse and Mental Health Services Administration, Rockville, MD.

Acknowledgments

This methodological document was prepared for the Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics and Quality (CBHSQ), by RTI International (a registered trademark and a trade name of Research Triangle Institute). The authors thank Rebecca Ahrnsbrak from CBHSQ for her critical review and feedback on this document.

Table of Contents

Chapter
1. Introduction
2. The Scaling Factors
   2.1 Estimating Prevalences
   2.2 Estimating Totals
3. Nearly Pseudo-Optimal Calibration
4. Standard Error Estimation with WTADJX
5. Exploring Alternative Standard Error Measures
6. Using Different Stratum Identifiers
7. Discussion
References

Appendix
A. SAS-callable SUDAAN Code for Computing Standard Error Measures with WTADJX


List of Tables

Table
1. Alternative Standard Error Measures for Mental Health Prevalence Estimates: NSDUH Adult Clinical Interview Data File, 2008 to 2012
2. Summarizing Differences Between Alternative Standard Error Measure and the Fully Corrected Internal Method (in Percent, using the D Statistic*)
3. Standard Error Measures for Mental Health Prevalence Estimates Using Different Stratum Identifiers
4. Summarizing the Impact of Removing the a_k from Equation (2) on the Fully Corrected Estimator (in Percent, Applying the D Statistic to CVs*)


1. Introduction

The National Survey on Drug Use and Health (NSDUH), conducted by the Substance Abuse and Mental Health Services Administration (SAMHSA), is one of the primary sources of data for population-based prevalence estimates of substance use and mental health indicators in the United States. The NSDUH interview includes several self-administered indicators of mental health, such as assessments of lifetime and past year major depressive episode (MDE), past month and past year psychological distress and functional impairment, as well as past year suicidality. From 2008 to 2012, a subsample of NSDUH adult respondents was selected to participate in the Mental Health Surveillance Study (MHSS), which was a telephone interview that included clinical assessments of the presence of selected mental disorders. MHSS clinicians administered semistructured diagnostic interviews to this subsample to assess the presence of selected mental disorders (Aldworth et al., 2010).

The purpose of the MHSS clinical component was to develop a statistical model to apply to the full NSDUH sample that would generate serious mental illness (SMI) prevalence estimates among adults (aged 18 years or older) at national and state levels and to monitor the prevalence of SMI over time.

In addition to producing a model for the NSDUH to yield model-based estimates of SMI among adults (Center for Behavioral Health Statistics and Quality [CBHSQ], 2015a), the 2008 to 2012 MHSS clinical data can be used to generate nationally representative prevalence estimates of past year mental disorders among the adult civilian, noninstitutionalized population in 2008 to 2012, across a wide spectrum of diagnostic categories, including mood disorders (major depressive disorder [MDD], bipolar I disorder, and/or dysthymic disorder), anxiety disorders (posttraumatic stress disorder [PTSD], panic disorder with and without agoraphobia, agoraphobia without history of panic disorder, social phobia, specific phobia, obsessive compulsive disorder [OCD], and/or generalized anxiety disorder [GAD]), eating disorders (anorexia nervosa and/or bulimia nervosa), substance use disorders (alcohol abuse, alcohol dependence, illicit drug abuse, and/or illicit drug dependence), intermittent explosive disorder, adjustment disorder, as well as psychotic symptoms (delusions and/or hallucinations). Karg et al. (2014) present the past 12 month prevalence estimates of specific mental disorders using the MHSS clinical data.

This document focuses on how the prevalence estimates and their standard errors were derived from the 2008 to 2012 MHSS clinical sample. In particular, it describes how the prevalence estimates covering the 2008 to 2012 time period were computed using sampling weights that had undergone a number of calibration adjustments, with an emphasis on the last adjustment: poststratification, the annual calibration of the clinical sample to the NSDUH control totals. It then discusses several alternative methods for measuring the standard errors of those estimates. Consistent with how standard errors of NSDUH estimates are computed (CBHSQ, 2015a), all these methods use Taylor-series linearization variance estimators. The purpose of this document is not only to provide information on how the standard errors were computed in our existing MHSS reports, but also to provide users of these data with information on how they can compute standard errors to determine precision levels and conduct statistical inference.

The focus of this report will be on the statistical rather than measurement issues. That is, the statistical analyses discussed in this document assume the diagnostics made by the mental health professional during the clinical interview using the Structured Clinical Interview for the DSM-IV-TR Axis I Disorders, Research Version, Non-patient Edition (SCID-I/NP) (First, Spitzer, Gibbon, & Williams, 2002) are accurate.

In addition, these analyses treat the yearly clinical samples as pure random samples (of both adults and time periods) with probabilities of selection accurately captured by the sample weights before the final poststratification. Therefore, these selection probabilities incorporate adjustments for unit nonresponse (unit response is treated like self-selection) and the exclusion from the clinical sample of adults who responded to the NSDUH main interview in Spanish.

A prevalence estimate is an estimated mean for a population. When measuring the standard error of a complex function of estimated population means, such as a regression coefficient, final weights can be treated as sampling weights because calibration adjustments have only marginal impact on these standard errors (except the impact resulting from changes in the weights themselves, which is captured by using the final weights). Moreover, software that fully incorporates the impact of calibration weighting using linearization techniques is not presently available.

More details on the probability sampling and weighting process can be found in Liao et al. (2014). Briefly, annual clinical sample weights were the product of five factors. The respondent's NSDUH main interview analysis weight was adjusted with

• a coverage adjustment to compensate for NSDUH main survey respondents who completed the survey in Spanish;
• the inverse of the probability the respondent was also selected for the clinical sample (the selection probability into the clinical sample was an independent function of an adult's NSDUH main interview responses, which varied across the years);
• a refusal adjustment to compensate for NSDUH respondents selected for clinical evaluation who did not wish to be recontacted;
• a second nonresponse adjustment to compensate mostly for those who agreed to be recontacted but were unavailable for the clinical evaluation (this included a few who agreed to be recontacted for the evaluations, but refused to respond when recontacted); and
• a poststratification adjustment to increase the efficiency of direct estimates from the clinical sample.

Strictly speaking, the last adjustment is a calibration to totals computed from the NSDUH main interview respondents. We follow the terminology of SUDAAN 11 (RTI International, 2012) and call this adjustment "poststratification," even though the totals computed from the NSDUH main-interview responses were not for mutually exclusive groups.

This document focuses on that last weighting adjustment, but other features of the MHSS require some discussion. First, the adult NSDUH main sample in 2008 was randomly divided into two halves. One half sample, denoted the 2008A sample, was administered functional impairment questions based on an abbreviated version of the World Health Organization Disability Assessment Schedule (WHODAS; Rehm et al., 1999). The other half sample, the 2008B sample, was administered questions based on the Sheehan Disability Scale (SDS; Leon, Olfson, Portera, Farber, & Sheehan, 1997). Both halves received psychological distress questions based on the Kessler 6 scale (K6; Kessler et al., 2003). From 2009 onward, only the WHODAS and K6 questions were used on the NSDUH main survey.

Weights were constructed separately each year, treating the 2008A and 2008B clinical samples as if they represented distinct years. In 2008 and 2009, these single-year samples were used to develop and verify statistical models that predicted SMI; however, because of the small sample sizes (759 in 2008A, 741 in 2008B, 520 in 2009, 516 in 2010, 1,495 in 2011, and 1,622 in 2012), the entire 5-year sample was used to produce the final statistical models. Similarly, the small sample sizes prevented annual estimation of the direct estimates as well.

Consequently, the clinical samples were also combined across the years to generate prevalence estimates of mental disorders. Because the sample size, sampling allocation, and weight adjustments for the clinical sample differed from year to year, gains in statistical efficiency could be realized by scaling the weights instead of letting each year contribute equally to the estimates.

These scaling factors were determined by focusing on the standard errors of prevalence estimates for SMI, any mental illness (AMI), and the occurrence of MDE in the previous year. A discussion of the assumptions underlying the determination and use of these factors and their implications on the estimation of prevalences and the annual numbers of adults with specific mental disorders over the 2008 to 2012 time period is contained in Chapter 2.

Chapter 3 describes how the clinical sample was calibrated to the NSDUH main sample each year in a nearly pseudo-optimal fashion (Kott, 2011). Chapter 4 shows how the WTADJX routine in SUDAAN 11 (RTI, 2012) was used to estimate yearly standard errors for totals and prevalences. As noted earlier, this chapter treats the weights before the final calibration as pure probability sampling weights based on the idea that this will, if anything, tend to overestimate standard errors (Kott & Day, 2014).

Chapter 5 describes how the standard error measures for prevalence estimates were calculated and discusses the implications of the displayed results. Because the weights for each year (2008A, 2008B, 2009, 2010, 2011, and 2012) were scaled when estimating the prevalences, the same scaling factors were used in computing the standard error measures.

The 900 original NSDUH variance strata (CBHSQ, 2015a) were collapsed into 100 MHSS variance strata so that MHSS standard error measures could be computed for Karg et al. (2014). An alternative set of collapsed strata had been employed in determining clinical weights and in modeling SMI (see Liao et al., 2014). Chapter 6 compares the standard errors computed using the two different sets of variance strata.

Chapter 7 provides further discussion of the statistical results in this document. It should be mentioned that by using WTADJX to measure standard errors, it is not possible, with the software presently available, to conduct a Wald/F test when comparing prevalence estimates across three or more groups. That is why Bonferroni-adjusted t tests were used when comparing prevalence estimates across age groups in Karg et al. (2014).
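The weight construction summarized earlier in this chapter (the NSDUH main interview analysis weight multiplied by the five listed adjustments) can be illustrated with a minimal sketch. The class name, field names, and numeric values below are hypothetical and chosen only for illustration; they are not taken from the MHSS data files or from SUDAAN.

```python
import math
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class WeightFactors:
    """Hypothetical container for one respondent's weight components."""
    main_weight: float         # NSDUH main interview analysis weight
    coverage_adj: float        # coverage adjustment (Spanish-language main interviews)
    inv_selection_prob: float  # 1 / P(selected into the clinical sample)
    refusal_adj: float         # adjustment for refusing to be recontacted
    nonresponse_adj: float     # second nonresponse adjustment
    poststrat_adj: float       # final calibration ("poststratification") factor


def clinical_weight(f: WeightFactors) -> float:
    """Final clinical weight: the main weight times the five adjustments."""
    return math.prod(astuple(f))


def weight_before_poststratification(f: WeightFactors) -> float:
    """Weight that the text treats as a pure probability-sampling weight."""
    return clinical_weight(f) / f.poststrat_adj


# Illustrative (made-up) values for a single respondent.
example = WeightFactors(1000.0, 1.05, 20.0, 1.10, 1.20, 0.95)
print(clinical_weight(example))
print(weight_before_poststratification(example))
```

Keeping the poststratification factor separate makes it easy to recover the pre-calibration weight, which is the quantity the standard error methods in later chapters treat as the sampling weight.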

2. The Scaling Factors

2.1 Estimating Prevalences

The yearly prevalence estimates for serious mental illness (SMI) were scaled to come close to minimizing the variance for the adult SMI prevalence estimate in 2008 to 2012. For the results from scaling the weights across years to be most relevant for prevalence estimates, we need either to (1) assume the underlying mental-health prevalence being estimated is constant across the years from 2008 to 2012, or (2) treat the target of estimation as the weighted mean of the annual prevalences, where the weight applied to each year is its scaling factor times its relative population size.

Mathematically, the true average prevalence from 2008 to 2012 can be expressed as

\[
Y = \frac{N_{2008}Y_{2008} + N_{2009}Y_{2009} + N_{2010}Y_{2010} + N_{2011}Y_{2011} + N_{2012}Y_{2012}}{N_{2008} + N_{2009} + N_{2010} + N_{2011} + N_{2012}},
\]

where $N_t$ and $Y_t$ are, respectively, the adult population size and the prevalence in year $t$. The assumption-free target of the scaled estimates is instead:

\[
Y_{\text{scaled}} = \frac{(.12)N_{2008}Y_{2008} + (.04)N_{2009}Y_{2009} + (.14)N_{2010}Y_{2010} + (.35)N_{2011}Y_{2011} + (.35)N_{2012}Y_{2012}}{(.12)N_{2008} + (.04)N_{2009} + (.14)N_{2010} + (.35)N_{2011} + (.35)N_{2012}}. \tag{1}
\]

We investigated the reasonableness of the former assumption that the 43 underlying mental health prevalence estimates were constant from 2008 to 2012 by computing the 5 yearly estimates for each variable (combining the 2008A and 2008B samples), then the standard errors of the $10 = \binom{5}{2}$ paired comparisons (e.g., the 2008 estimate for past year explosive disorder minus the 2010 estimate) using the fully corrected internal version of the standard error measure.

We deemed a difference (e.g., between the 2008 and 2010 estimates of a prevalence) to be statistically significant if the smallest of the 10 p-values per variable was less than .01. There was less than a 10 percent chance of this happening under the null hypothesis of an unchanging prevalence across the 5 years.
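The decision rule just described can be sketched in a few lines. The function names are hypothetical. The list of p-values is invented for illustration, except that its minimum (.012) matches the value this document reports for past year alcohol dependence or abuse. The sketch shows only the Bonferroni arithmetic, not how the p-values themselves were obtained.

```python
from itertools import combinations

YEARS = [2008, 2009, 2010, 2011, 2012]

# Number of paired year-to-year comparisons per variable: C(5, 2).
N_PAIRS = len(list(combinations(YEARS, 2)))


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison cutoff that keeps the overall level at or below alpha."""
    return alpha / n_comparisons


def any_significant(p_values, alpha: float = 0.10) -> bool:
    """True if the smallest p-value falls below the Bonferroni cutoff."""
    return min(p_values) < bonferroni_threshold(alpha, len(p_values))


# Hypothetical p-values for one variable's 10 paired comparisons;
# only the minimum (.012) is taken from the document.
p_values = [0.012, 0.08, 0.15, 0.22, 0.31, 0.40, 0.47, 0.58, 0.73, 0.91]

print(N_PAIRS)
print(bonferroni_threshold(0.10, N_PAIRS))
print(any_significant(p_values))
```

With 10 comparisons the cutoff is .1/10 = .01, so a smallest p-value of .012 is not declared significant under this rule.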
Note that .01 is a Bonferroni adjustment applied to .1 (i.e., .01 = .1/10, with 10 being the number of paired comparisons per variable).

Three of the differences were statistically significant, which is about what should be expected with 43 variables (i.e., less than 4.3). There were 430 (43 x 10) paired comparisons in all. If we had alternatively used a Bonferroni adjustment for the lowest p-value of the 430 (.00030), that difference, and thus every difference, would not be significant at the .1 level (.00030 x 430 = .129, which exceeds .1).

This means the clinical data were consistent with the null hypothesis of each prevalence remaining constant from 2008 to 2012. Note, however, that yearly sample sizes were small, so our failure to reject the null hypothesis may have more to do with a lack of power than the underlying truth of the null hypothesis.¹

2.2 Estimating Totals

It is more problematic to use the scaled weights when estimating the average yearly number of adults with a mental health disorder from 2008 to 2012 overall or within some demographic group (e.g., Hispanics) than when estimating yearly prevalences. This method, simply summing the scaled weights of relevant clinical interview respondents (where membership in the demographic group of interest defines relevance when needed), was used in Karg et al. (2014).

Scaling the weights actually estimates the numerator of equation (1); that is, the number of adults in the group having the disorder weighted by .12 in 2008, the number in 2009 by .04, and so forth. A more natural estimation target would weight each year equally. These targets are clearly different because the population grew between 2008 and 2012.

A possible alternative method for estimating an average yearly number of relevant adults with a mental disorder would be to compute the product of

• the relevant prevalence estimate calculated with the scaled weights, and
• the average yearly relevant population total computed from the main National Survey on Drug Use and Health (NSDUH) sample (i.e., scaling the main NSDUH weights by 1/5).

This alternative approach not only has a more natural estimation target, it should also result in smaller standard errors because it uses population estimates from the main NSDUH sample, which is considerably larger than the clinical sample. The main drawback of this "product" method is that it is more cumbersome to produce, requiring the computation of two statistics (one from the clinical sample and one from the main NSDUH sample) for each estimate. Another is that the standard error measures for the product estimates would be ad hoc.² By contrast, computing a standard error measure for an estimated average yearly total calculated with the scaled weights is straightforward.

¹ To illustrate the power, or lack of it, in our original Bonferroni-adjusted test, look at the yearly estimated prevalences of past year alcohol dependence or abuse from 2008 to 2012. The lowest p-value among the 10 pairwise comparisons is .012, which is greater than .010 and so not statistically significant at the Bonferroni-adjusted 0.1 level. (Note: the lowest pairwise comparison p-value for only 3 of the 43 variables is below .012.) The yearly prevalence estimates that were not significantly different for past year alcohol dependence or abuse as determined by this test ranged from 3.67 percent to 8.57 percent.

² The standard error measure for the estimated prevalence component could be converted into a coefficient of variation and then multiplied by the estimated average yearly population component. This assumes that the random nature of the latter estimate makes a negligible contribution to the variance of the product.
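The two estimation targets in Section 2.1 and the "product" method for totals can be compared numerically with a minimal sketch. The scaling factors (.12, .04, .14, .35, .35) come from equation (1); every population size, prevalence, and standard error below is made up for illustration. The standard error step follows one reading of footnote 2, namely that the prevalence coefficient of variation is applied to the product estimate, which is equivalent to multiplying the prevalence standard error by the population figure. That reading is an assumption, not something the text spells out.

```python
# Year-level scaling factors from equation (1).
SCALING = {2008: 0.12, 2009: 0.04, 2010: 0.14, 2011: 0.35, 2012: 0.35}


def average_prevalence(pop: dict, prev: dict) -> float:
    """Population-weighted average prevalence (each year weighted by N_t)."""
    return sum(pop[t] * prev[t] for t in pop) / sum(pop.values())


def scaled_prevalence(pop: dict, prev: dict, scaling: dict = SCALING) -> float:
    """Target of the scaled estimates: each year weighted by scaling[t] * N_t."""
    numerator = sum(scaling[t] * pop[t] * prev[t] for t in pop)
    denominator = sum(scaling[t] * pop[t] for t in pop)
    return numerator / denominator


def product_total(prevalence: float, se_prevalence: float, avg_population: float):
    """Product-method total and an ad hoc standard error via the prevalence CV.

    Assumes the population estimate contributes negligibly to the variance.
    """
    total = prevalence * avg_population
    cv = se_prevalence / prevalence
    return total, cv * total


# Hypothetical adult population sizes (millions) and yearly prevalences.
pop = {2008: 225.0, 2009: 228.0, 2010: 230.0, 2011: 232.0, 2012: 235.0}
prev = {2008: 0.040, 2009: 0.042, 2010: 0.044, 2011: 0.046, 2012: 0.048}

print(average_prevalence(pop, prev))
print(scaled_prevalence(pop, prev))

avg_pop = sum(pop.values()) / len(pop)  # equal weight (1/5) per year
print(product_total(scaled_prevalence(pop, prev), 0.004, avg_pop))
```

When the prevalence is constant across years, the two targets coincide. When it trends upward, as in the invented values here, the scaled target sits above the equally weighted one because 2011 and 2012 carry most of the weight.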

3. Nearly Pseudo-Optimal Calibration

The Mental Health Surveillance Study (MHSS) clinical samples were calibrated separately in each year. This section describes how calibrat
