Econometric Analysis Of Cross Section And Panel Data - IPC-IG

Econometric Analysis of Cross Section and Panel Data
Jeffrey M. Wooldridge
The MIT Press
Cambridge, Massachusetts
London, England

Contents

I INTRODUCTION AND BACKGROUND

1 Introduction
  1.1 Causal Relationships and Ceteris Paribus Analysis
  1.2 The Stochastic Setting and Asymptotic Analysis
    1.2.1 Data Structures
    1.2.2 Asymptotic Analysis
  1.3 Some Examples
  1.4 Why Not Fixed Explanatory

2 Conditional Expectations and Related Concepts in Econometrics
  2.1 The Role of Conditional Expectations in Econometrics
  2.2 Features of Conditional Expectations
    2.2.1 Definition and Examples
    2.2.2 Partial Effects, Elasticities, and Semielasticities
    2.2.3 The Error Form of Models of Conditional Expectations
    2.2.4 Some Properties of Conditional Expectations
    2.2.5 Average Partial Effects
  2.3 Linear Projections
  Problems
  Appendix 2A
    2.A.1 Properties of Conditional Expectations
    2.A.2 Properties of Conditional Variances
    2.A.3 Properties of Linear Projections

3 Basic Asymptotic Theory
  3.1 Convergence of Deterministic Sequences
  3.2 Convergence in Probability and Bounded in Probability
  3.3 Convergence in Distribution
  3.4 Limit Theorems for Random Samples
  3.5 Limiting Behavior of Estimators and Test Statistics
    3.5.1 Asymptotic Properties of Estimators
    3.5.2 Asymptotic Properties of Test Statistics
  Problems

II LINEAR MODELS

4 The Single-Equation Linear Model and OLS Estimation
  4.1 Overview of the Single-Equation Linear Model
  4.2 Asymptotic Properties of OLS
    4.2.1 Consistency
    4.2.2 Asymptotic Inference Using OLS
    4.2.3 Heteroskedasticity-Robust Inference
    4.2.4 Lagrange Multiplier (Score) Tests
  4.3 OLS Solutions to the Omitted Variables Problem
    4.3.1 OLS Ignoring the Omitted Variables
    4.3.2 The Proxy Variable–OLS Solution
    4.3.3 Models with Interactions in Unobservables
  4.4 Properties of OLS under Measurement Error
    4.4.1 Measurement Error in the Dependent Variable
    4.4.2 Measurement Error in an Explanatory

5 Instrumental Variables Estimation of Single-Equation Linear Models
  5.1 Instrumental Variables and Two-Stage Least Squares
    5.1.1 Motivation for Instrumental Variables Estimation
    5.1.2 Multiple Instruments: Two-Stage Least Squares
  5.2 General Treatment of 2SLS
    5.2.1 Consistency
    5.2.2 Asymptotic Normality of 2SLS
    5.2.3 Asymptotic Efficiency of 2SLS
    5.2.4 Hypothesis Testing with 2SLS
    5.2.5 Heteroskedasticity-Robust Inference for 2SLS
    5.2.6 Potential Pitfalls with 2SLS
  5.3 IV Solutions to the Omitted Variables and Measurement Error Problems
    5.3.1 Leaving the Omitted Factors in the Error Term
    5.3.2 Solutions Using Indicators of the Unobservables
  Problems

6 Additional Single-Equation Topics
  6.1 Estimation with Generated Regressors and Instruments

    6.1.1 OLS with Generated Regressors
    6.1.2 2SLS with Generated Instruments
    6.1.3 Generated Instruments and Regressors
  6.2 Some Specification Tests
    6.2.1 Testing for Endogeneity
    6.2.2 Testing Overidentifying Restrictions
    6.2.3 Testing Functional Form
    6.2.4 Testing for Heteroskedasticity
  6.3 Single-Equation Methods under Other Sampling Schemes
    6.3.1 Pooled Cross Sections over Time
    6.3.2 Geographically Stratified Samples
    6.3.3 Spatial Dependence
    6.3.4 Cluster Samples
  Problems
  Appendix

7 Estimating Systems of Equations by OLS and GLS
  7.1 Introduction
  7.2 Some Examples
  7.3 System OLS Estimation of a Multivariate Linear System
    7.3.1 Preliminaries
    7.3.2 Asymptotic Properties of System OLS
    7.3.3 Testing Multiple Hypotheses
  7.4 Consistency and Asymptotic Normality of Generalized Least Squares
    7.4.1 Consistency
    7.4.2 Asymptotic Normality
  7.5 Feasible GLS
    7.5.1 Asymptotic Properties
    7.5.2 Asymptotic Variance of FGLS under a Standard Assumption
  7.6 Testing Using FGLS
  7.7 Seemingly Unrelated Regressions, Revisited
    7.7.1 Comparison between OLS and FGLS for SUR Systems
    7.7.2 Systems with Cross Equation Restrictions
    7.7.3 Singular Variance Matrices in SUR

  7.8 The Linear Panel Data Model, Revisited
    7.8.1 Assumptions for Pooled OLS
    7.8.2 Dynamic Completeness
    7.8.3 A Note on Time Series Persistence
    7.8.4 Robust Asymptotic Variance Matrix
    7.8.5 Testing for Serial Correlation and Heteroskedasticity after Pooled OLS
    7.8.6 Feasible GLS Estimation under Strict Exogeneity
  Problems

8 System Estimation by Instrumental Variables
  8.1 Introduction and Examples
  8.2 A General Linear System of Equations
  8.3 Generalized Method of Moments Estimation
    8.3.1 A General Weighting Matrix
    8.3.2 The System 2SLS Estimator
    8.3.3 The Optimal Weighting Matrix
    8.3.4 The Three-Stage Least Squares Estimator
    8.3.5 Comparison between GMM 3SLS and Traditional 3SLS
  8.4 Some Considerations When Choosing an Estimator
  8.5 Testing Using GMM
    8.5.1 Testing Classical Hypotheses
    8.5.2 Testing Overidentification Restrictions
  8.6 More Efficient Estimation and Optimal

9 Simultaneous Equations Models
  9.1 The Scope of Simultaneous Equations Models
  9.2 Identification in a Linear System
    9.2.1 Exclusion Restrictions and Reduced Forms
    9.2.2 General Linear Restrictions and Structural Equations
    9.2.3 Unidentified, Just Identified, and Overidentified Equations
  9.3 Estimation after Identification
    9.3.1 The Robustness-Efficiency Trade-off
    9.3.2 When Are 2SLS and 3SLS Equivalent?
    9.3.3 Estimating the Reduced Form Parameters
  9.4 Additional Topics in Linear SEMs

    9.4.1 Using Cross Equation Restrictions to Achieve Identification
    9.4.2 Using Covariance Restrictions to Achieve Identification
    9.4.3 Subtleties Concerning Identification and Efficiency in Linear Systems
  9.5 SEMs Nonlinear in Endogenous Variables
    9.5.1 Identification
    9.5.2 Estimation
  9.6 Different Instruments for Different Equations
  Problems

10 Basic Linear Unobserved Effects Panel Data Models
  10.1 Motivation: The Omitted Variables Problem
  10.2 Assumptions about the Unobserved Effects and Explanatory Variables
    10.2.1 Random or Fixed Effects?
    10.2.2 Strict Exogeneity Assumptions on the Explanatory Variables
    10.2.3 Some Examples of Unobserved Effects Panel Data Models
  10.3 Estimating Unobserved Effects Models by Pooled OLS
  10.4 Random Effects Methods
    10.4.1 Estimation and Inference under the Basic Random Effects Assumptions
    10.4.2 Robust Variance Matrix Estimator
    10.4.3 A General FGLS Analysis
    10.4.4 Testing for the Presence of an Unobserved Effect
  10.5 Fixed Effects Methods
    10.5.1 Consistency of the Fixed Effects Estimator
    10.5.2 Asymptotic Inference with Fixed Effects
    10.5.3 The Dummy Variable Regression
    10.5.4 Serial Correlation and the Robust Variance Matrix Estimator
    10.5.5 Fixed Effects GLS
    10.5.6 Using Fixed Effects Estimation for Policy Analysis
  10.6 First Differencing Methods
    10.6.1 Inference
    10.6.2 Robust Variance

    10.6.3 Testing for Serial Correlation
    10.6.4 Policy Analysis Using First Differencing
  10.7 Comparison of Estimators
    10.7.1 Fixed Effects versus First Differencing
    10.7.2 The Relationship between the Random Effects and Fixed Effects Estimators
    10.7.3 The Hausman Test Comparing the RE and FE Estimators
  Problems

11 More Topics in Linear Unobserved Effects Models
  11.1 Unobserved Effects Models without the Strict Exogeneity Assumption
    11.1.1 Models under Sequential Moment Restrictions
    11.1.2 Models with Strictly and Sequentially Exogenous Explanatory Variables
    11.1.3 Models with Contemporaneous Correlation between Some Explanatory Variables and the Idiosyncratic Error
    11.1.4 Summary of Models without Strictly Exogenous Explanatory Variables
  11.2 Models with Individual-Specific Slopes
    11.2.1 A Random Trend Model
    11.2.2 General Models with Individual-Specific Slopes
  11.3 GMM Approaches to Linear Unobserved Effects Models
    11.3.1 Equivalence between 3SLS and Standard Panel Data Estimators
    11.3.2 Chamberlain’s Approach to Unobserved Effects Models
  11.4 Hausman and Taylor-Type Models
  11.5 Applying Panel Data Methods to Matched Pairs and

III GENERAL APPROACHES TO NONLINEAR

  12.2 Identification, Uniform Convergence, and Consistency
  12.3 Asymptotic Normality

  12.4 Two-Step M-Estimators
    12.4.1 Consistency
    12.4.2 Asymptotic Normality
  12.5 Estimating the Asymptotic Variance
    12.5.1 Estimation without Nuisance Parameters
    12.5.2 Adjustments for Two-Step Estimation
  12.6 Hypothesis Testing
    12.6.1 Wald Tests
    12.6.2 Score (or Lagrange Multiplier) Tests
    12.6.3 Tests Based on the Change in the Objective Function
    12.6.4 Behavior of the Statistics under Alternatives
  12.7 Optimization Methods
    12.7.1 The Newton-Raphson Method
    12.7.2 The Berndt, Hall, Hall, and Hausman Algorithm
    12.7.3 The Generalized Gauss-Newton Method
    12.7.4 Concentrating Parameters out of the Objective Function
  12.8 Simulation and Resampling Methods
    12.8.1 Monte Carlo Simulation
    12.8.2

13 Maximum Likelihood Methods
  13.1 Introduction
  13.2 Preliminaries and Examples
  13.3 General Framework for Conditional MLE
  13.4 Consistency of Conditional MLE
  13.5 Asymptotic Normality and Asymptotic Variance Estimation
    13.5.1 Asymptotic Normality
    13.5.2 Estimating the Asymptotic Variance
  13.6 Hypothesis Testing
  13.7 Specification Testing
  13.8 Partial Likelihood Methods for Panel Data and Cluster Samples
    13.8.1 Setup for Panel Data
    13.8.2 Asymptotic Inference
    13.8.3 Inference with Dynamically Complete Models
    13.8.4 Inference under Cluster

  13.9 Panel Data Models with Unobserved Effects
    13.9.1 Models with Strictly Exogenous Explanatory Variables
    13.9.2 Models with Lagged Dependent Variables
  13.10 Two-Step MLE
  Problems
  Appendix 13A

14 Generalized Method of Moments and Minimum Distance Estimation
  14.1 Asymptotic Properties of GMM
  14.2 Estimation under Orthogonality Conditions
  14.3 Systems of Nonlinear Equations
  14.4 Panel Data Applications
  14.5 Efficient Estimation
    14.5.1 A General Efficiency Framework
    14.5.2 Efficiency of MLE
    14.5.3 Efficient Choice of Instruments under Conditional Moment Restrictions
  14.6 Classical Minimum Distance Estimation
  Problems
  Appendix 14A

IV NONLINEAR MODELS AND RELATED TOPICS

15 Discrete Response Models
  15.1 Introduction
  15.2 The Linear Probability Model for Binary Response
  15.3 Index Models for Binary Response: Probit and Logit
  15.4 Maximum Likelihood Estimation of Binary Response Index Models
  15.5 Testing in Binary Response Index Models
    15.5.1 Testing Multiple Exclusion Restrictions
    15.5.2 Testing Nonlinear Hypotheses about β
    15.5.3 Tests against More General Alternatives
  15.6 Reporting the Results for Probit and Logit
  15.7 Specification Issues in Binary Response Models
    15.7.1 Neglected Heterogeneity
    15.7.2 Continuous Endogenous Explanatory

    15.7.3 A Binary Endogenous Explanatory Variable
    15.7.4 Heteroskedasticity and Nonnormality in the Latent Variable Model
    15.7.5 Estimation under Weaker Assumptions
  15.8 Binary Response Models for Panel Data and Cluster Samples
    15.8.1 Pooled Probit and Logit
    15.8.2 Unobserved Effects Probit Models under Strict Exogeneity
    15.8.3 Unobserved Effects Logit Models under Strict Exogeneity
    15.8.4 Dynamic Unobserved Effects Models
    15.8.5 Semiparametric Approaches
    15.8.6 Cluster Samples
  15.9 Multinomial Response Models
    15.9.1 Multinomial Logit
    15.9.2 Probabilistic Choice Models
  15.10 Ordered Response Models
    15.10.1 Ordered Logit and Ordered Probit
    15.10.2 Applying Ordered Probit to Interval-Coded Data
  Problems

16 Corner Solution Outcomes and Censored Regression Models
  16.1 Introduction and Motivation
  16.2 Derivations of Expected Values
  16.3 Inconsistency of OLS
  16.4 Estimation and Inference with Censored Tobit
  16.5 Reporting the Results
  16.6 Specification Issues in Tobit Models
    16.6.1 Neglected Heterogeneity
    16.6.2 Endogenous Explanatory Variables
    16.6.3 Heteroskedasticity and Nonnormality in the Latent Variable Model
    16.6.4 Estimation under Conditional Median Restrictions
  16.7 Some Alternatives to Censored Tobit for Corner Solution Outcomes
  16.8 Applying Censored Regression to Panel Data and Cluster Samples
    16.8.1 Pooled Tobit
    16.8.2 Unobserved Effects Tobit Models under Strict

    16.8.3 Dynamic Unobserved Effects Tobit Models
  Problems

17 Sample Selection, Attrition, and Stratified Sampling
  17.1 Introduction
  17.2 When Can Sample Selection Be Ignored?
    17.2.1 Linear Models: OLS and 2SLS
    17.2.2 Nonlinear Models
  17.3 Selection on the Basis of the Response Variable: Truncated Regression
  17.4 A Probit Selection Equation
    17.4.1 Exogenous Explanatory Variables
    17.4.2 Endogenous Explanatory Variables
    17.4.3 Binary Response Model with Sample Selection
  17.5 A Tobit Selection Equation
    17.5.1 Exogenous Explanatory Variables
    17.5.2 Endogenous Explanatory Variables
  17.6 Estimating Structural Tobit Equations with Sample Selection
  17.7 Sample Selection and Attrition in Linear Panel Data Models
    17.7.1 Fixed Effects Estimation with Unbalanced Panels
    17.7.2 Testing and Correcting for Sample Selection Bias
    17.7.3 Attrition
  17.8 Stratified Sampling
    17.8.1 Standard Stratified Sampling and Variable Probability Sampling
    17.8.2 Weighted Estimators to Account for Stratification
    17.8.3 Stratification Based on Exogenous Variables
  Problems

18 Estimating Average Treatment Effects
  18.1 Introduction
  18.2 A Counterfactual Setting and the Self-Selection Problem
  18.3 Methods Assuming Ignorability of Treatment
    18.3.1 Regression Methods
    18.3.2 Methods Based on the Propensity Score
  18.4 Instrumental Variables Methods
    18.4.1 Estimating the ATE Using

    18.4.2 Estimating the Local Average Treatment Effect by IV
  18.5 Further Issues
    18.5.1 Special Considerations for Binary and Corner Solution Responses
    18.5.2 Panel Data
    18.5.3 Nonbinary Treatments
    18.5.4 Multiple Treatments
  Problems

19 Count Data and Related Models
  19.1 Why Count Data Models?
  19.2 Poisson Regression Models with Cross Section Data
    19.2.1 Assumptions Used for Poisson Regression
    19.2.2 Consistency of the Poisson QMLE
    19.2.3 Asymptotic Normality of the Poisson QMLE
    19.2.4 Hypothesis Testing
    19.2.5 Specification Testing
  19.3 Other Count Data Regression Models
    19.3.1 Negative Binomial Regression Models
    19.3.2 Binomial Regression Models
  19.4 Other QMLEs in the Linear Exponential Family
    19.4.1 Exponential Regression Models
    19.4.2 Fractional Logit Regression
  19.5 Endogeneity and Sample Selection with an Exponential Regression Function
    19.5.1 Endogeneity
    19.5.2 Sample Selection
  19.6 Panel Data Methods
    19.6.1 Pooled QMLE
    19.6.2 Specifying Models of Conditional Expectations with Unobserved Effects
    19.6.3 Random Effects Methods
    19.6.4 Fixed Effects Poisson Estimation
    19.6.5 Relaxing the Strict Exogeneity

20 Duration Analysis
  20.1 Introduction
  20.2 Hazard Functions
    20.2.1 Hazard Functions without Covariates
    20.2.2 Hazard Functions Conditional on Time-Invariant Covariates
    20.2.3 Hazard Functions Conditional on Time-Varying Covariates
  20.3 Analysis of Single-Spell Data with Time-Invariant Covariates
    20.3.1 Flow Sampling
    20.3.2 Maximum Likelihood Estimation with Censored Flow Data
    20.3.3 Stock Sampling
    20.3.4 Unobserved Heterogeneity
  20.4 Analysis of Grouped Duration Data
    20.4.1 Time-Invariant Covariates
    20.4.2 Time-Varying Covariates
    20.4.3 Unobserved Heterogeneity
  20.5 Further Issues
    20.5.1 Cox’s Partial Likelihood Method for the Proportional Hazard Model
    20.5.2 Multiple-Spell Data
    20.5.3 Competing Risks

Acknowledgments

My interest in panel data econometrics began in earnest when I was an assistant professor at MIT, after I attended a seminar by a graduate student, Leslie Papke, who would later become my wife. Her empirical research using nonlinear panel data methods piqued my interest and eventually led to my research on estimating nonlinear panel data models without distributional assumptions. I dedicate this text to Leslie.

My former colleagues at MIT, particularly Jerry Hausman, Daniel McFadden, Whitney Newey, Danny Quah, and Thomas Stoker, played significant roles in encouraging my interest in cross section and panel data econometrics. I also have learned much about the modern approach to panel data econometrics from Gary Chamberlain of Harvard University.

I cannot discount the excellent training I received from Robert Engle, Clive Granger, and especially Halbert White at the University of California at San Diego. I hope they are not too disappointed that this book excludes time series econometrics.

I did not teach a course in cross section and panel data methods until I started teaching at Michigan State. Fortunately, my colleague Peter Schmidt encouraged me to teach the course at which this book is aimed. Peter also suggested that a text on panel data methods that uses “vertical bars” would be a worthwhile contribution.

Several classes of students at Michigan State were subjected to this book in manuscript form at various stages of development. I would like to thank these students for their perseverance, helpful comments, and numerous corrections. I want to specifically mention Scott Baier, Linda Bailey, Ali Berker, Yi-Yi Chen, William Horrace, Robin Poston, Kyosti Pietola, Hailong Qian, Wendy Stock, and Andrew Toole. Naturally, they are not responsible for any remaining errors.

I was fortunate to have several capable, conscientious reviewers for the manuscript. Jason Abrevaya (University of Chicago), Joshua Angrist (MIT), David Drukker (Stata Corporation), Brian McCall (University of Minnesota), James Ziliak (University of Oregon), and three anonymous reviewers provided excellent suggestions, many of which improved the book’s organization and coverage.

The people at MIT Press have been remarkably patient, and I have very much enjoyed working with them. I owe a special debt to Terry Vaughn (now at Princeton University Press) for initiating this project and then giving me the time to produce a manuscript with which I felt comfortable. I am grateful to Jane McDonald and Elizabeth Murry for reenergizing the project and for allowing me significant leeway in crafting the final manuscript. Finally, Peggy Gordon and her crew at P. M. Gordon Associates, Inc., did an expert job in editing the manuscript and in producing the final text.

Preface

This book is intended primarily for use in a second-semester course in graduate econometrics, after a first course at the level of Goldberger (1991) or Greene (1997). Parts of the book can be used for special-topics courses, and it should serve as a general reference.

My focus on cross section and panel data methods—in particular, what is often dubbed microeconometrics—is novel, and it recognizes that, after coverage of the basic linear model in a first-semester course, an increasingly popular approach is to treat advanced cross section and panel data methods in one semester and time series methods in a separate semester. This division reflects the current state of econometric practice.

Modern empirical research that can be fitted into the classical linear model paradigm is becoming increasingly rare. For instance, it is now widely recognized that a student doing research in applied time series analysis cannot get very far by ignoring recent advances in estimation and testing in models with trending and strongly dependent processes. This theory takes a very different direction from the classical linear model than does cross section or panel data analysis. Hamilton’s (1994) time series text demonstrates this difference unequivocally.

Books intended to cover an econometric sequence of a year or more, beginning with the classical linear model, tend to treat advanced topics in cross section and panel data analysis as direct applications or minor extensions of the classical linear model (if they are treated at all). Such treatment needlessly limits the scope of applications and can result in poor econometric practice. The focus in such books on the algebra and geometry of econometrics is appropriate for a first-semester course, but it results in oversimplification or sloppiness in stating assumptions.
Approaches to estimation that are acceptable under the fixed regressor paradigm so prominent in the classical linear model can lead one badly astray under practically important departures from the fixed regressor assumption.

Books on “advanced” econometrics tend to be high-level treatments that focus on general approaches to estimation, thereby attempting to cover all data configurations—including cross section, panel data, and time series—in one framework, without giving special attention to any. A hallmark of such books is that detailed regularity conditions are treated on par with the practically more important assumptions that have economic content. This is a burden for students learning about cross section and panel data methods, especially those who are empirically oriented: definitions and limit theorems about dependent processes need to be included among the regularity conditions in order to cover time series applications.

In this book I have attempted to find a middle ground between more traditional approaches and the more recent, very unified approaches. I present each model and method with a careful discussion of assumptions of the underlying population model. These assumptions, couched in terms of correlations, conditional expectations, conditional variances and covariances, or conditional distributions, usually can be given behavioral content. Except for the three more technical chapters in Part III, regularity conditions—for example, the existence of moments needed to ensure that the central limit theorem holds—are not discussed explicitly, as these have little bearing on applied work. This approach makes the assumptions relatively easy to understand, while at the same time emphasizing that assumptions concerning the underlying population and the method of sampling need to be carefully considered in applying any econometric method.

A unifying theme in this book is the analogy approach to estimation, as exposited by Goldberger (1991) and Manski (1988). [For nonlinear estimation methods with cross section data, Manski (1988) covers several of the topics included here in a more compact format.] Loosely, the analogy principle states that an estimator is chosen to solve the sample counterpart of a problem solved by the population parameter. The analogy approach is complemented nicely by asymptotic analysis, and that is the focus here.

By focusing on asymptotic properties I do not mean to imply that small-sample properties of estimators and test statistics are unimportant. However, one typically first applies the analogy principle to devise a sensible estimator and then derives its asymptotic properties. This approach serves as a relatively simple guide to doing inference, and it works well in large samples (and often in samples that are not so large).
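
As a rough illustration of the analogy principle described above (this sketch is not taken from the book), consider the coefficient vector of a linear projection. In the population it solves the moment condition E[x'(y - x b)] = 0, so that b = [E(x'x)]^(-1) E(x'y). Replacing each expectation with a sample average gives the familiar OLS estimator. The code assumes simulated data; the function name `analogy_ols`, the sample size, and the values in `beta_true` are hypothetical choices made only for this example.

```python
import numpy as np


def analogy_ols(X, y):
    """Sample analogue of b = [E(x'x)]^(-1) E(x'y)."""
    n = X.shape[0]
    Sxx = X.T @ X / n  # sample average standing in for E(x'x)
    Sxy = X.T @ y / n  # sample average standing in for E(x'y)
    return np.linalg.solve(Sxx, Sxy)


# Simulated "population" with hypothetical parameter values.
rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # intercept and one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

beta_hat = analogy_ols(X, y)

# The estimate satisfies the sample version of the moment condition:
# the average of x_i'(y_i - x_i b) is zero up to rounding error.
sample_moments = X.T @ (y - X @ beta_hat) / n
print(beta_hat, sample_moments)
```

With a large simulated sample the estimate should land close to `beta_true`, which is the sense in which asymptotic analysis complements the analogy approach.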
Small-sample adjustments may improve performance, but such considerations almost always come after a large-sample analysis and are often done on a case-by-case basis.

The book contains proofs or outlines the proofs of many assertions, focusing on the role played by the assumptions with economic content while downplaying or ignoring regularity conditions. The book is primarily written to give applied researchers a very firm understanding of why certain methods work and to give students the background for developing new methods. But many of the arguments used throughout the book are representative of those made in modern econometric research (sometimes without the technical details). Students interested in doing research in cross section or panel data methodology will find much here that is not available in other graduate texts.

I have also included several empirical examples with included data sets. Most of the data sets come from published work or are intended to mimic data sets used in modern empirical analysis. To save space I illustrate only the most commonly used methods on the most common data structures. Not surprisingly, these overlap considerably with methods that are packaged in econometric software programs. Other examples are of models where, given access to the appropriate data set, one could undertake an empirical analysis.

The numerous end-of-chapter problems are an important component of the book. Some problems contain important points that are not fully described in the text; others cover new ideas that can be analyzed using the tools presented in the current and previous chapters. Several of the problems require using the data sets that are included with the book.

As with any book, the topics here are selective and reflect what I believe to be the methods needed most often by applied researchers. I also give coverage to topics that have recently become important but are not adequately treated in other texts. Part I of the book reviews some tools that are elusive in mainstream econometrics books—in particular, the notion of conditional expectations, linear projections, and various convergence results. Part II begins by applying these tools to the analysis of single-equation linear models using cross section data. In principle, much of this material should be review for students having taken a first-semester course. But starting with single-equation linear models provides a bridge from the classical analysis of linear models to a more modern treatment, and it is the simplest vehicle to illustrate the application of the tools in Part I. In addition, several methods that are used often in applications—but rarely covered adequately in texts—can be covered in a single framework.

I approach estimation of linear systems of equations with endogenous variables from a different perspective than traditional treatments. Rather than begin with simultaneous equations models, we study estimation of a general linear system by instrumental variables. This approach allows us to later apply these results to models with the same statistical structure as simultaneous equations models, including panel data models.
Importantly, we can study the generalized method of moments estimator from the beginning and easily relate it to the more traditional three-stage least squares estimator.

The analysis of general estimation methods for nonlinear models in Part III begins with a general treatment of asymptotic theory of estimators obtained from nonlinear optimization problems. Maximum likelihood, partial maximum likelihood, and generalized method of moments estimation are shown to be generally applicable estimation approaches. The method of nonlinear least squares is also covered as a method for estimating models of conditional means.

Part IV covers several nonlinear models used by modern applied researchers. Chapters 15 and 16 treat limited dependent variable models, with attention given to handling certain endogeneity problems in such models. Panel data methods for binary response and censored variables, including some new estimation approaches, are also covered in these chapters.

Chapter 17 contains a treatment of sample selection problems for both cross section and panel data, including some recent advances. The focus is on the case where the population model is linear, but some results are given for nonlinear models as well. Attrition in panel data models is also covered, as are methods for dealing with stratified samples. Recent approaches to estimating average treatment effects are treated in Chapter 18.

Poisson and related regression models, both for cross section and panel data, are treated in Chapter 19. These rely heavily on the method of quasi-maximum likelihood estimation. A brief but modern treatment of duration models is provided in Chapter 20.

I have given short shrift to some important, albeit more advanced, topics. The setting here is, at least in modern parlance, essentially parametric. I have not included detailed treatment of recent advances in semiparametric or nonparametric analysis. In many cases these topics are not conceptually difficult. In fact, many semiparametric methods focus primarily on estimating a finite dimensional parameter in the presence of an infinite dimensional nuisance parameter—a feature shared by traditional parametric methods, such as nonlinear least squares and partial maximum likelihood. It is estimating infinite dimensional parameters that is conceptually and technically challenging.

At the appropriate point, in lieu of treating semiparametric and nonparametric methods, I mention when such extensions are possible, and I provide references. A benefit of a modern approach to parametric models is that it provides a seamless transition to semiparametric and nonparametric methods.
General surveys of semiparametric and nonparametric methods are available in Volume 4 of the Handbook of Econometrics—see Powell (1994) and Härdle and Linton (1994)—as well as in Volume 11 of the Handbook of Statistics—see Horowitz (1993) and Ullah and Vinod (1993).

I only briefly treat simulation-based methods of estimation and inference. Computer simulations can be used to estimate complicated nonlinear models when traditional optimization methods are ineffective. The bootstrap method of inference and confidence interval construction can improve on asymptotic analysis. Volume 4 of the Handbook of Econometrics and Volume 11 of the Handbook of Statistics contain nice surveys of these topics (Hajivassiliou and Ruud, 1994; Hall, 1994; Hajivassiliou, 1993; and Keane, 1993).
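
To make the bootstrap idea mentioned above concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a sample mean. It is only an illustration under stated assumptions, not the procedure developed in the book: the percentile interval is the simplest bootstrap variant, and the function name `bootstrap_percentile_ci`, the simulated exponential data, and the number of replications are all hypothetical choices for this example.

```python
import numpy as np


def bootstrap_percentile_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data) with i.i.d. resampling."""
    rng = np.random.default_rng(seed)
    n = len(data)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        # Resample n observations with replacement and recompute the statistic.
        resample = data[rng.integers(0, n, size=n)]
        draws[b] = stat(resample)
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Hypothetical skewed sample, for illustration only.
rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=500)

lower, upper = bootstrap_percentile_ci(data, np.mean)
print(f"sample mean: {data.mean():.3f}, 95% interval: ({lower:.3f}, {upper:.3f})")
```

The same function accepts any statistic that maps a one-dimensional array to a number (for example `np.median`). The resampling scheme assumes independent draws, so it would need to be adapted for panel or cluster data.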

On an organizational note, I refer to sections throughout the book first by chapter number followed by section number and, sometimes, subsection number. Therefore, Section 6.3 refers to Section 3 in Chapter 6, and Section 13.8.3 refers to Subsection 3 of Section 8 in Chapter 13. By always including the chapter number, I hope to minimiz
