Understanding Advanced Statistical Methods


Statistics

Providing a much-needed bridge between elementary statistics courses and advanced research methods courses, Understanding Advanced Statistical Methods helps you grasp the fundamental assumptions and machinery behind sophisticated statistical topics, such as logistic regression, maximum likelihood, bootstrapping, nonparametrics, and Bayesian methods. The book teaches you how to properly model, think critically, and design your own studies to avoid common errors. It leads you to think differently not only about math and statistics but also about general research and the scientific method.

Enabling you to answer the why behind statistical methods, this text helps you successfully draw conclusions when the premises are flawed. It empowers you to use advanced statistical methods with confidence and develop your own statistical recipes.

With a focus on statistical models as producers of data, the book enables you to more easily understand the machinery of advanced statistics. It also downplays the “population” interpretation of statistical models and presents Bayesian methods before frequentist ones. Requiring no prior calculus experience, the text employs a “just-in-time” approach that introduces mathematical topics, including calculus, where needed. Formulas throughout the text are used to explain why calculus and probability are essential in statistical modeling. The authors also intuitively explain the theory and logic behind real data analysis, incorporating a range of application examples from the social, economic, biological, medical, physical, and engineering sciences.

Texts in Statistical Science
Peter H. Westfall
Kevin S. S. Henning


CHAPMAN & HALL/CRC
Texts in Statistical Science Series

Series Editors
Francesca Dominici, Harvard School of Public Health, USA
Julian J. Faraway, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada

Analysis of Failure and Survival Data (P. J. Smith)
The Analysis of Time Series — An Introduction, Sixth Edition (C. Chatfield)
Applied Bayesian Forecasting and Time Series Analysis (A. Pole, M. West, and J. Harrison)
Applied Categorical and Count Data Analysis (W. Tang, H. He, and X.M. Tu)
Applied Nonparametric Statistical Methods, Fourth Edition (P. Sprent and N.C. Smeeton)
Applied Statistics — Handbook of GENSTAT Analysis (E.J. Snell and H. Simpson)
Applied Statistics — Principles and Examples (D.R. Cox and E.J. Snell)
Applied Stochastic Modelling, Second Edition (B.J.T. Morgan)
Bayesian Data Analysis, Second Edition (A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin)
Bayesian Ideas and Data Analysis: An Introduction for Scientists and Statisticians (R. Christensen, W. Johnson, A. Branscum, and T.E. Hanson)
Bayesian Methods for Data Analysis, Third Edition (B.P. Carlin and T.A. Louis)
Beyond ANOVA — Basics of Applied Statistics (R.G. Miller, Jr.)
The BUGS Book: A Practical Introduction to Bayesian Analysis (D. Lunn, C. Jackson, N. Best, A. Thomas, and D. Spiegelhalter)
A Course in Categorical Data Analysis (T. Leonard)
A Course in Large Sample Theory (T.S. Ferguson)
Data Driven Statistical Methods (P. Sprent)
Decision Analysis — A Bayesian Approach (J.Q. Smith)
Design and Analysis of Experiments with SAS (J. Lawson)
Elementary Applications of Probability Theory, Second Edition (H.C. Tuckwell)
Elements of Simulation (B.J.T. Morgan)
Epidemiology — Study Design and Data Analysis, Second Edition (M. Woodward)
Essential Statistics, Fourth Edition (D.A.G. Rees)
Exercises and Solutions in Statistical Theory (L.L. Kupper, B.H. Neelon, and S.M. O’Brien)
Exercises and Solutions in Biostatistical Theory (L.L. Kupper, B.H. Neelon, and S.M. O’Brien)
Extending the Linear Model with R — Generalized Linear, Mixed Effects and Nonparametric Regression Models (J.J. Faraway)
A First Course in Linear Model Theory (N. Ravishanker and D.K. Dey)
Generalized Additive Models: An Introduction with R (S. Wood)
Generalized Linear Mixed Models: Modern Concepts, Methods and Applications (W. W. Stroup)
Graphics for Statistics and Data Analysis with R (K.J. Keen)
Interpreting Data — A First Course in Statistics (A.J.B. Anderson)
Introduction to General and Generalized Linear Models (H. Madsen and P. Thyregod)
An Introduction to Generalized Linear Models, Third Edition (A.J. Dobson and A.G. Barnett)
Introduction to Multivariate Analysis (C. Chatfield and A.J. Collins)
Introduction to Optimization Methods and Their Applications in Statistics (B.S. Everitt)
Introduction to Probability with R (K. Baclawski)
Introduction to Randomized Controlled Clinical Trials, Second Edition (J.N.S. Matthews)
Introduction to Statistical Inference and Its Applications with R (M.W. Trosset)
Problem Solving — A Statistician’s Guide, Second Edition (C. Chatfield)
Introduction to Statistical Methods for Clinical Trials (T.D. Cook and D.L. DeMets)
Readings in Decision Analysis (S. French)
Introduction to Statistical Limit Theory (A.M. Polansky)
Introduction to the Theory of Statistical Inference (H. Liero and S. Zwanzig)
Large Sample Methods in Statistics (P.K. Sen and J. da Motta Singer)
Linear Algebra and Matrix Analysis for Statistics (S. Banerjee and A. Roy)
Logistic Regression Models (J.M. Hilbe)
Markov Chain Monte Carlo — Stochastic Simulation for Bayesian Inference, Second Edition (D. Gamerman and H.F. Lopes)
Mathematical Statistics (K. Knight)
Modeling and Analysis of Stochastic Systems, Second Edition (V.G. Kulkarni)
Modelling Binary Data, Second Edition (D. Collett)
Modelling Survival Data in Medical Research, Second Edition (D. Collett)
Multivariate Analysis of Variance and Repeated Measures — A Practical Approach for Behavioural Scientists (D.J. Hand and C.C. Taylor)
Multivariate Statistics — A Practical Approach (B. Flury and H. Riedwyl)
Multivariate Survival Analysis and Competing Risks (M. Crowder)
Pólya Urn Models (H. Mahmoud)
Practical Data Analysis for Designed Experiments (B.S. Yandell)
Practical Longitudinal Data Analysis (D.J. Hand and M. Crowder)
Randomization, Bootstrap and Monte Carlo Methods in Biology, Third Edition (B.F.J. Manly)
Sampling Methodologies with Applications (P.S.R.S. Rao)
Stationary Stochastic Processes: Theory and Applications (G. Lindgren)
Statistical Analysis of Reliability Data (M.J. Crowder, A.C. Kimber, T.J. Sweeting, and R.L. Smith)
Statistical Methods for Spatial Data Analysis (O. Schabenberger and C.A. Gotway)
Statistical Methods for SPC and TQM (D. Bissell)
Statistical Methods in Agriculture and Experimental Biology, Second Edition (R. Mead, R.N. Curnow, and A.M. Hasted)
Statistical Process Control — Theory and Practice, Third Edition (G.B. Wetherill and D.W. Brown)
Statistical Theory: A Concise Introduction (F. Abramovich and Y. Ritov)
Statistical Theory, Fourth Edition (B.W. Lindgren)
Statistics for Accountants (S. Letchford)
Statistics for Epidemiology (N.P. Jewell)
Statistics for Technology — A Course in Applied Statistics, Third Edition (C. Chatfield)
Statistics in Engineering — A Practical Approach (A.V. Metcalfe)
Statistics in Research and Development, Second Edition (R. Caulcutt)
Stochastic Processes: An Introduction, Second Edition (P.W. Jones and P. Smith)
Practical Multivariate Analysis, Fifth Edition (A. Afifi, S. May, and V.A. Clark)
Survival Analysis Using S — Analysis of Time-to-Event Data (M. Tableman and J.S. Kim)
A Primer on Linear Models (J.F. Monahan)
Time Series Analysis (H. Madsen)
Practical Statistics for Medical Research (D.G. Altman)
Principles of Uncertainty (J.B. Kadane)
Probability — Methods and Measurement (A. O’Hagan)
The Theory of Linear Models (B. Jørgensen)
Time Series: Modeling, Computation, and Inference (R. Prado and M. West)
Understanding Advanced Statistical Methods (P.H. Westfall and K.S.S. Henning)

Texts in Statistical Science

Understanding Advanced Statistical Methods

Peter H. Westfall
Information Systems and Quantitative Sciences
Texas Tech University, USA

Kevin S. S. Henning
Department of Economics and International Business
Sam Houston State University, USA

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2013 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20130401
International Standard Book Number-13: 978-1-4665-1211-5 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com

Contents

List of Examples
Preface
Acknowledgments
Authors

1. Introduction: Probability, Statistics, and Science
    1.1 Reality, Nature, Science, and Models
    1.2 Statistical Processes: Nature, Design and Measurement, and Data
    1.3 Models
    1.4 Deterministic Models
    1.5 Variability
    1.6 Parameters
    1.7 Purely Probabilistic Statistical Models
    1.8 Statistical Models with Both Deterministic and Probabilistic Components
    1.9 Statistical Inference
    1.10 Good and Bad Models
    1.11 Uses of Probability Models
    Vocabulary and Formula Summaries
    Exercises

2. Random Variables and Their Probability Distributions
    2.1 Introduction
    2.2 Types of Random Variables: Nominal, Ordinal, and Continuous
    2.3 Discrete Probability Distribution Functions
    2.4 Continuous Probability Distribution Functions
    2.5 Some Calculus—Derivatives and Least Squares
    2.6 More Calculus—Integrals and Cumulative Distribution Functions
    Vocabulary and Formula Summaries
    Exercises

3. Probability Calculation and Simulation
    3.1 Introduction
    3.2 Analytic Calculations, Discrete and Continuous Cases
    3.3 Simulation-Based Approximation
    3.4 Generating Random Numbers
    Vocabulary and Formula Summaries
    Exercises

4. Identifying Distributions
    4.1 Introduction
    4.2 Identifying Distributions from Theory Alone
    4.3 Using Data: Estimating Distributions via the Histogram
    4.4 Quantiles: Theoretical and Data-Based Estimates
    4.5 Using Data: Comparing Distributions via the Quantile–Quantile Plot
    4.6 Effect of Randomness on Histograms and q–q Plots

    Vocabulary and Formula Summaries
    Exercises

5. Conditional Distributions and Independence
    5.1 Introduction
    5.2 Conditional Discrete Distributions
    5.3 Estimating Conditional Discrete Distributions
    5.4 Conditional Continuous Distributions
    5.5 Estimating Conditional Continuous Distributions
    5.6 Independence
    Vocabulary and Formula Summaries
    Exercises

6. Marginal Distributions, Joint Distributions, Independence, and Bayes’ Theorem
    6.1 Introduction
    6.2 Joint and Marginal Distributions
    6.3 Estimating and Visualizing Joint Distributions
    6.4 Conditional Distributions from Joint Distributions
    6.5 Joint Distributions When Variables Are Independent
    6.6 Bayes’ Theorem
    Vocabulary and Formula Summaries
    Exercises

7. Sampling from Populations and Processes
    7.1 Introduction
    7.2 Sampling from Populations
    7.3 Critique of the Population Interpretation of Probability Models
        7.3.1 Even When Data Are Sampled from a Population
        7.3.2 Point 1: Nature Defines the Population, Not Vice Versa
        7.3.3 Point 2: The Population Is Not Well Defined
        7.3.4 Point 3: Population Conditional Distributions Are Discontinuous
        7.3.5 Point 4: The Conditional Population Distribution p(y | x) Does Not Exist for Many x
        7.3.6 Point 5: The Population Model Ignores Design and Measurement Effects
    7.4 The Process Model versus the Population Model
    7.5 Independent and Identically Distributed Random Variables and Other Models
    7.6 Checking the iid Assumption
    Vocabulary and Formula Summaries
    Exercises

8. Expected Value and the Law of Large Numbers
    8.1 Introduction
    8.2 Discrete Case
    8.3 Continuous Case
    8.4 Law of Large Numbers

    8.5 Law of Large Numbers for the Bernoulli Distribution
    8.6 Keeping the Terminology Straight: Mean, Average, Sample Mean, Sample Average, and Expected Value
    8.7 Bootstrap Distribution and the Plug-In Principle
    Vocabulary and Formula Summaries
    Exercises

9. Functions of Random Variables: Their Distributions and Expected Values
    9.1 Introduction
    9.2 Distributions of Functions: The Discrete Case
    9.3 Distributions of Functions: The Continuous Case
    9.4 Expected Values of Functions and the Law of the Unconscious Statistician
    9.5 Linearity and Additivity Properties
    9.6 Nonlinear Functions and Jensen’s Inequality
    9.7 Variance
    9.8 Standard Deviation, Mean Absolute Deviation, and Chebyshev’s Inequality
    9.9 Linearity Property of Variance
    9.10 Skewness and Kurtosis
    Vocabulary and Formula Summaries
    Exercises

10. Distributions of Totals
    10.1 Introduction
    10.2 Additivity Property of Variance
    10.3 Covariance and Correlation
    10.4 Central Limit Theorem
    Vocabulary and Formula Summaries
    Exercises

11. Estimation: Unbiasedness, Consistency, and Efficiency
    11.1 Introduction
    11.2 Biased and Unbiased Estimators
    11.3 Bias of the Plug-In Estimator of Variance
    11.4 Removing the Bias of the Plug-In Estimator of Variance
    11.5 The Joke Is on Us: The Standard Deviation Estimator Is Biased after All
    11.6 Consistency of Estimators
    11.7 Efficiency of Estimators
