Detection Of Change In The Spatiotemporal Mean Function PDF Free Download

1y ago

18 Views

1 Downloads

1.06 MB

22 Pages

Report/dmca

Download PDF

Transcription

J. R. Statist. Soc. B (2016)Detection of change in the spatiotemporal meanfunctionOleksandr Gromenko,Tulane University, New Orleans, USAPiotr KokoszkaColorado State University, Fort Collins, USAand Matthew ReimherrPennsylvania State University, University Park, USA[Received December 2013. Final revision October 2015]Summary. The paper develops inferential methodology for detecting a change in the annualpattern of an environmental variable measured at fixed locations in a spatial region. Using aframework built on functional data analysis, we model observations as a collection of functionvalued time sequences available at many sites. Each sequence is modelled as an annual meanfunction, which may change, plus a sequence of error functions, which are spatially correlated.The tests statistics extend the cumulative sum paradigm to this more complex setting. Theirasymptotic distributions are not parameter free because of the spatial dependence but canbe effectively approximated by Monte Carlo simulations. The new methodology is applied toprecipitation data. Its finite sample performance is assessed by a simulation study.Keywords: Change point; Functional data; Mean function; Spatiotemporal data1.IntroductionMotivated by the problem of detecting a change in the annual pattern of a climatic variableover a spatial region, we develop change point detection methodology that is applicable tospatially indexed panels of functions. Climatic variables can be measured by using a varietyof techniques and sources, but motivating the present work are data collected at a number ofterrestrial observatories. These observatories often have the capacity to generate data at a veryhigh temporal frequency. It is this mechanism which functional data analysis attempts to exploit:the data can be viewed as curves or functions, not just vectors of unrelated co-ordinates. Forexample, at any speciﬁc location, the record of 365 daily precipitation amounts is naturallyviewed as a noisy curve with some underlying annual pattern. Data at many locations obtainedover several decades form a collection of spatially dependent sequences. Owing to the annualperiodicity, these time series are usually not stationary on small timescales but can be naturallyviewed as stationary time series of annual functions. The present work thus views years as thefundamental statistical unit. We denote a measurement made in year n, at spatial location s S,Address for correspondence: Matthew Reimherr, Department of Statistics, Pennsylvania State University, 411Thomas Building, University Park, PA 16802, USA.E-mail: mreimherr@psu.edu 2016 Royal Statistical Society1369–7412/16/79000

2O. Gromenko, P. Kokoszka and M. Reimherrand time (of year) t T , as Xn .s, t/. The spatiotemporal ﬁeld may have complicated and highlynon-stationary patterns, but, in indexing by n, we exploit the repetition of those patterns fromyear to year.The goal of the present work is to develop statistical methods for inferring whether thepattern of a spatiotemporal process has changed over a speciﬁc time period, which is severaldecades for the data that motivated this research, e.g. the industrial revolution, the introductionof hybrid cars or international agreements which place restrictions on greenhouse gases. Sucha perspective motivates us to use a change point detection framework. An alternative andcomplementary approach is to consider trend detection and estimation. Change point analysishas been recognized as a useful paradigm, which continues to beneﬁt from new approaches, e.g.Fryzlewicz (2014) and Cho and Fryzlewicz (2015). Developing change point tools for functionaldata has been undertaken only in the last few years; Horváth et al. (2010), Berkes et al. (2009),Hörmann and Kokoszka (2010), Zhang et al. (2011) and Aston and Kirch (2012a,b). Severalchapters of Horváth and Kokoszka (2012) provide an account of this research.At present, we are unaware of any research which has explored the change point problemfor spatially indexed functional data. Our methodology uses the proven cumulative sum approach as a starting point. Although many competing methods are now available for scalardata, the fundamental idea of comparing the value of a parameter (the mean function in ourcase) before and after all potential change points remains attractive. In the absence of anychange point methodology for the data structure that we study, and because of the lack of aparametric model on which likelihood methods could be based, this seems to be a productiveapproach. The main challenge in this project turned out to be the estimation of the spatiotemporal structure of a functional random ﬁeld in the presence of potential change pointsin the temporal mean structure. This problem can be approached from many directions; wepresent a solution which has worked best for the data that motivate this research. To estimate the spatiotemporal covariance structure, we exploit the replications (one per year) of thespatial error ﬁeld. In this, our methodology differs from purely spatial analysis, where onlya single replication is available. The derivation of the tests statistics requires computation ofspatiotemporal moments. The practical implementation of the test relies on a new calibrationapproach which utilizes feasible approximations of these moments under the assumption ofGaussianity.To illustrate the new methodology, we examine daily precipitation patterns at 59 locationsin 12 states in the midwestern USA, over the course of 60 years. Since we view each yearas a statistical unit, we have a sample size of N 60, with 59 spatial locations, and 365 timepoints. The dimension of the problem is, therefore, substantially larger than the sample size. Ourmethodology exploits the underlying periodic structure of data to make the problem tractable.Such a data structure is common to many weather, environmental and ecological data sets, andit is hoped that our methodology will ﬁnd applications in these ﬁelds.The remainder of the paper is organized as follows. Section 2 sets up the statistical frameworkby introducing the requisite concepts and assumptions. The test statistics and their asymptotic distributions are presented in Section 3. Section 4 contains the details of the implementation of the tests, which are further illustrated in Section 5 by application to midwest precipitation data. In Section 5, we also study the ﬁnite sample properties of the tests. Proofsof the results of Section 3, technical calculations and details of the estimation of the covariance structure are collected respectively in Appendices A, B and C. Our methods are implemented in an R package, scpt, which, along with example code, is available to download from http://scpt.r-forge.r-project.org, or through R by using the commandinstall.packages("scpt",rep "http://R-Forge.R-project.org").

Detection of Change in the Spatiotemporal Mean Function2.3Notation and assumptionsLet Xn .sk ; ti / be the observation of some environmental variable taken at the spatial location sk ,in month or day ti of year n. We denote by K the number of spatial locations, by T the number ofobservations per year and by N the number of years that are available for the analysis. We treatthe whole record Xn .sk ; ti /, 1 i T , as a single functional observation. To lighten notationand to reﬂect the functional perspective, we shall often drop the index t, so Xn .sk / representsthe entire temporal record in year n and at location sk . The functional perspective reﬂects thefact that the values Xn .sk ; t/ (temperature, pollution, etc.) exist at any time t. As with otherfunctional data analysis tools and procedures, we formulate our assumptions in term of thecomplete functions Xn .sk /.Assumption 1. Each Xn is a random ﬁeld over S T , where S is a compact subset of Rd andT is a compact subset of R. We assume that Xn L2 .S T / and that EXn2 .s; t/ dtds :Under assumption 1, Xn can be expressed asXn .s; t/ μn .s; t/ "n .s; t/,.2:1/where μn L2 .S T / is a deterministic mean function and "n L2 .S T / is a zero meanrandom-error term. Our goal is to test the null hypothesis that the sequences of mean functions{μn } are the same across n. If this hypothesis is true, the long-term annual pattern over a regiondoes not change. The random functions "n describe weather patterns which change from yearto year. The testing problem can thus be stated asH0 : μ1 : : : μN versus HA : μ1 : : : μnÆ μnÆ 1 : : : μN ,for some 1 nÅ N. (The equalities are in the L2 .S T / sense.)The next two assumptions specify the dependence structure of the data.Assumption 2. The errors "1 , "2 , : : : , "N are independent and identically distributed randomﬁelds on S T .In the case of a single series of scalar observations, assumption 2 would mean that we testfor a change point in mean, assuming that other aspects of the distribution, in particular thevariance, do not change. Such an assumption is quite common. If both mean and variance areallowed to change, the problem becomes much more difﬁcult, even in the case of scalar normalobservations; see Horváth (1993).Our next assumption postulates the separability of the spatiotemporal covariance. Throughout the paper, we use Kronecker’s delta function: 1if n m,δnm 0if n m:Assumption 3. The covariance function of Xn factors ascov{Xn .sk ; t/, Xm .sl ; t /} E["n .sk ; t/"m .sl ; t /] δnm C.t, t / σ.sk , sl /,where C.t, t / and σ.sk , sl / are respectively purely temporal and spatial covariances.Although separability can be criticized as an excessively strong assumption (see for exampleStein (2005)), it is often found acceptable and useful in both theoretical and applied research;

4O. Gromenko, P. Kokoszka and M. Reimherrsee Haas (1995), Genton (2007), Hoff (2011) and Paul and Peng (2011). Also, separability playsa key role in modelling very large data sets since it signiﬁcantly reduces the computational timethat is required for inverting space–time covariance matrices; Sun et al. (2012). In our setting,it also leads to asymptotic distributions which are functionals of standard Gaussian processesand so can be readily simulated.Our implementation of the test procedure allows the variances to change between locationsbut assumes that the correlation function is stationary and isotropic. We do not state herestationarity and isotropy as explicit assumptions, as our methods are designed for a variety ofassumptions and techniques for estimating σ.·, ·/. This is discussed further in Appendix C.Since C.t, t / and σ.sk , sl / are determined up to only multiplicative constants, we impose theidentiﬁability condition C.t, t/dt 1:.2:2/T.s/ L2 .TBy assumption 1, Xn/ for almost all s S, so without any loss of generality weassume that each function Xn .s/ is an element of L2 .T /. By the spectral theorem, the covariancefunction C admits the representation C.t, t / λi vi .t/vi .t /,i 1where λ1 λ2 : : : 0 are the eigenvalues of C and the vi are the functional principal components; see for example chapter 2 of Horváth and Kokoszka (2012). Each function Xn .s/ admitsthe corresponding Karhunen–Loève expansion Xn .s; t/ μn .s; t/ ξni .s/vi .t/,.2:3/i 1where ξni .s/ are the scores deﬁned by ξni .s/ {Xn .s; t/ μn .s; t/}vi .t/dt:We make one last assumption for technical convenience. We discuss the estimation of thespatiotemporal covariance structure in Appendix C. However, large sample justiﬁcation canbe derived for any estimators which are consistent. Assumption 4 refers to the K K spatialcovariance matrixΣ [σ.sk , sl /, 1 k, l K]:Assumption 4. The estimators Ĉ and Σ̂ are such that, as N , P{Ĉ.t, s/ C.t, s/}2 dt ds 0,P Σ̂ Σ 0:Furthermore, assume that the eigenvalues of C are distinct: λ1 λ2 λ3 : : :.Assumption 4 is very weak as most estimators are root N consistent under mild conditions;see for example chapter 2 of Horváth and Kokoszka (2012). However, it allows us to establishthe asymptotic results of our testing procedure for a variety of estimators. The distinctness ofthe eigenvalues ensures that the empirical eigenvalues and eigenfunctions of Ĉ are consistent,and it is a common assumption in functional data analysis.

Detection of Change in the Spatiotemporal Mean Function53. Tests statistics and their asymptotic distributionBy assumptions 2 and 3, the vi and the λi do not depend on n or sk . Denote by v̂i and by λ̂itheir estimates. To detect a change point, we propose the test statistics r 2pKNN 1 r 1 Λ̂1 2ŵ.k/Xn .sk / Xn .sk /, v̂i ,.3:1/λ̂iN n 1N k 1i 1r 1 n 1andp KN 1 Λ̂2 2ŵ.k/N k 1i 1 r 1 r Nr Xn .sk / Xn .sk /, v̂iN n 1n 1 2:.3:2/In the case of a single location (K 1; ŵ.1/ 1), statistic (3.1) reduces to the test statistic ofBerkes et al. (2009). In the spatial setting, we introduce the random weights ŵ.k/ which reﬂectthe intuition that spatially close records contribute some redundant information and so shouldbe given smaller weights, whereas records at isolated locations contribute more informationand should be given larger weights. We allow the weights to be random to reﬂect that they areusually estimated from the data. Statistics Λ̂1 and Λ̂2 are weighted sums of Cramér–von Misestype of functionals of the functional cumulative sum process. Kolmogorov–Smirnov type ofstatistics can be deﬁned analogously, but it is well known that such tests generally have poorﬁnite sample properties owing to the slow convergence of maximally selected statistics to thedouble-exponential distribution; see for example Horváth et al. (1999) and Bugni et al. (2009),among many other contributions. We therefore focus on the Cramér–von Mises type of statisticsΛ̂1 and Λ̂2 . The selection of the weights is discussed at the end of this section. Observe that fora p which explains a large fraction of the variance r 22pNrN r r Xn .sk / Xn .sk /, v̂i Xn .sk / Xn .sk / ,N n 1N n 1i 1 n 1n 1by Parceval’s identity. Using this fact we introduce one more test statistic based directly on theL2 -norm: Λ̂2 KNrN 1 r ŵ.k/X.s/ Xn .sk /nkN n 1N 2 k 1r 1 n 12:.3:3/The inner products in Λ̂1 are normalized with the estimated variances of the scores λ̂i , whichleads to an asymptotic distribution that is free of the λi . Only the largest p eigenvalues λ̂i are used so as not to inﬂate the variability. The asymptotic distribution of Λ̂2 depends on the λi , but thestatistic does not require the selection of p. In the spatial setting, both limit distributions dependon the spatial covariances; unlike the limit in Berkes et al. (2009), none of them is parameterfree. The incorporation and estimation of the spatial covariance structure in the presence of apossible change point in the temporal functional structure has not been addressed in existingresearch.The asymptotic null distributions of the test statistics are presented in theorem 1.Theorem 1. Suppose that assumptions 1–4 hold. Furthermore, assume that the weights ŵ.k/are chosen such that ŵ.k/ P w.k/ where w.k/ are deterministic real-valued weights. Then,under hypothesis H0 ,p 1K D2Λ̂1 Λ1 w.k/Bik.x/dx,.3:4/k 1i 10

6O. Gromenko, P. Kokoszka and M. ReimherrDΛ̂2 Λ2 K k 1 DΛ̂2 Λ 2 p w.k/K λii 1w.k/k 1 i 110 λi2Bik.x/dx,10.3:5/2Bik.x/dx,.3:6/where Bik are Brownian bridges, independent across i. For each i, the vector of Brownianbridges Bi .Bi1 , : : : , BiK /T has covariance matrix Σ, i.e. Σ 1 2 Bi is a vector of independentstandard Brownian bridges.Theorem 2 describes the behaviour of the test statistics under the alternative of a single changepoint. Extensions to multiple change points or epidemic alternatives are possible but are moretechnical and are not pursued here. In the context of a single functional time series, the issuesthat are related to the behaviour under HA of similar test statistics have been well explained inAston and Kirch (2012a,b).Theorem 2. Suppose that assumptions 1–4 hold. Furthermore, assume that the weightsŵ.k/ are chosen such that ŵ.k/ P w.k/ where w.k/ are deterministic real-valued weights.Abusing the notation slightly, let μnÆ μ1 , μnÆ 1 μ2 and Δ μ1 μ2 0. Under HA , ifnÅ N θ .0, 1/ and Δ.sk /, vi 0 for some 1 i p and 1 k K, thenΛ̂1 NK w.k/k 1p i 1λ 1Δ.sk /, vii2 .1 θ/2 θ23P oP .N/ andΛ̂2 NK k 1w.k/p Δ.sk /, vii 12 .1 θ/32 θ2P oP .N/ , with an analogous statement for Λ̂2 .The proofs of theorems 1 and 2 are provided in Appendix A. If a change point has beendetected, its location can be estimated byK r Nr w.k/r̂ arg maxXn .sk / Xn .sk /N n 11 r N k 1n 12:.3:7/We have thus far assumed arbitrary weights ŵ.k/. We shall now explore the choices that arein some sense optimal. In spatial statistics, the weights are typically selected to minimize theexpected squared distance to an unknown object of interest, be it the value at an unknownlocation or a parameter of the model. In a testing setting, the most natural approach is tominimize the variance of the test statistic under hypothesis H0 . The asymptotic variances of thelimits in theorem 1 can be shown to be proportional towT Σ2 w K w.k/w.l/σ 2 .sk , sl /,k,l 1whereΣ2 [σ 2 .sk , sl /, 1 k, l N]:.3:8/(Note that the entries of Σ are σ.sk , sl / and those of Σ2 are their squares.) Using the method of

Detection of Change in the Spatiotemporal Mean Function7Lagrange multipliers, it is not difﬁcult to verify that the weights w minimizing wT Σ2 w subjectto wT 1 ΣKk 1 w.k/ 1 are given byw .Σ2 / 1 1:T1 .Σ2 / 1 1.3:9/The estimation of the covariances σ.sk , sl / is addressed in Appendix C.4.Description of the test proceduresIn this section, we provide algorithmic descriptions of the tests based on the asymptotic resultsof Section 3. An R implementation is available in the package scpt. To implement the tests,we must estimate the spatial covariance matrix Σ and the temporal eigenelements λi and vi .The latter can be calculated once an estimate Ĉ of the temporal covariance kernel C is available.There are several ways to compute Σ̂ and Ĉ. The key difﬁculty is to obtain estimates which arevalid, at least approximately, under both H0 and HA . We explored several approaches and usedthe method that is described in Appendix C for the ﬁnal analysis. It uses B-splines for estimatingσ.·, ·/, where we assume that, whereas the variance from location to location might be different,the correlation function is stationary and isotropic. This allows us to pool across locations toobtain a very good estimator.We also performed many simulations aimed at comparing the performance of the tests based on statistics Λ̂1 , Λ̂2 and Λ̂2 . The test based on Λ̂2 (which does not require the selection ofthe optimal number p of functional principal components) tends to overreject. If it does notdetect a change point, we can be fairly sure that there is none. If it does detect a change point,a more careful analysis based on the test statistics Λ̂1 or Λ̂2 which require a careful selection of p (calibration) is recommended. We thus ﬁrst describe the test based on Λ̂2 , which does notrequire calibration, and then proceed with the description of the calibration method. Algorithm 1 (test based on the statistic Λ̂2 ).Step 1: calculate the differenced series (C.1).Step 2: estimate the spatial covariances by using formulae (C.7)–(C.9) followed by parametricor non-parametric smoothing.Step 3: calculate the weights by using expressions (C.6) and (C.11).Step 4: rescale (if necessary) T to the unit interval [0, 1] and estimate the temporal covariancefunction by using expression (C.10).Step 5: calculate the eigenfunctions and the eigenvalues of the covariance function (C.10).Step 6: using the estimates that were obtained in the previous steps, calculate the Monte Carlodistribution of 1KT N2Λ2 ŵ.k/ λ̂iBik.x/ dx:k 1i 101 20 , B0 , : : : , B0 /T , and the B0 are independent standardFor each i, Bi Σ̂ Bi0 , where Bi0 .Bi1iKiki20Brownian bridges. The vectors Bi are independent. Step 7: calculate the statistic Λ̂2 given by equation (3.3). Find the P-value by using the MonteCarlo distribution that was found in the previous step.We now proceed with the description of the tests which adaptively choose the value of pleading to the correct empirical size. They require that the scores are approximately multivariatenormal. Justiﬁcation of expression (4.1) is based on lemma 1 in Appendix A.

8O. Gromenko, P. Kokoszka and M. ReimherrAlgorithm 2 (tests which employ calibration).Step 1: perform steps 1–5 of algorithm 1.Step 2: conduct simulations to determine the optimal number of functional principal components as follows. For every 1 p T , repeat the following steps R times.(a) Generate artiﬁcial data according toXn .sk ; t/ T λ̂i ξni .sk / v̂i .t/1 n N,1 k K,.4:1/i 1where ξni NK .0, Σ̂/ are independent vectors in an array indexed by .n, i/. All quantitieswith circumﬂexes are estimates obtained for the original data.(b) Check whether Λ̂i .p/ qi .p, α/, i 1, 2, where qi .p, α/ is the .1 α/th quantile of theMonte Carlo distribution of Λi .p/; see expressions (3.4) and (3.5).Step 3: ﬁnd p for which the number of rejections in the previous step is closest to the nominallevel of signiﬁcance α; denote this value by popt (which depends on i 1, 2 and α).Step 4: calculate the statistic Λ̂1 .popt / given by expression (3.1) or Λ̂2 .popt / given by expression(3.2). Find the P-value by using the Monte Carlo distribution of Λ1 .popt / or Λ2 .popt /.The calibration algorithm is computationally intensive. We need to generate the Monte Carlodistribution of Λi .p/, the right-hand sides in expression (3.4) and (3.5), for each value of p. Weused 104 replications to do it and we used R 104 as well. Algorithm 2 was applied in Section5 by using the Rcpp package; Eddelbuettel and François (2013). Using our R package for thisprocedure, the application took approximately 6 min on an Intel 2.6-GHz Quad core i7 centralprocessor unit. The procedure runs almost instantaneously on a smaller, simulated data set thatis included in the package.The nominal level of signiﬁcance is achieved only for the calibrated procedure without screen ing with the test based on λ̂2 . The power of the test depends slightly on the level of smoothing,but the size of the calibrated test does not depend on it. In principle, the matrix in equation(3.9) may be close to singular. This has not happened in our simulations or data example. Possible remedies include choosing a more restrictive parametric spatial covariance model and/orincreasing the size of the nugget.5.Application to precipitation dataWe illustrate the procedures that were described in Section 4 by applying them to historicalmidwest precipitation records. Our objective is not to perform a detailed climatological analysis,but rather to illustrate chief aspects of the statistical methodology.The data have been obtained from the ‘Global historical climatological network database’,which is one of the main data sets used for climate monitoring. This data set represents compiled high quality collections of main climate parameters such as daily maximum and minimumtemperature, amount of precipitation (liquid equivalent), amount of snowfall and snow depth.The data set, as well as extensive documentation, is available from the National Oceanic andAtmospheric Administration: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/. The dataselected are daily precipitation records from midwest states Illinois, Indiana, Iowa, Kansas,Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota and Wisconsin; Fig. 1. The number of stations is K 59, and the time interval is 1941–2000. The locationsand the time interval are selected on the basis of the availability of data. We use only recordswhich contain no more than 20 missing observations in a row and less then 5% of total missing

Detection of Change in the Spatiotemporal Mean FunctionFig. 1.9Locations of the 59 stations selectedobservations. To remove the effects due to the heavy tail distribution, we apply the transformationXn .s; t/ log10 {Yn .s; t/ 1},where Yn .s; t/ are original records in tenths of a millimetre. After the transformation, we presmooth data by using the cubic splines function smooth.spline in R; Fig. 2. For our precipitation data set and simulated data, the conclusions of the test were not affected by thispresmoothing step; they were the same for several degrees of smoothing which visually preserved the general shape of the curves.The application of the test of Gabrys and Kokoszka (2007) to residuals obtained by subtracting the sample mean functions before and after the estimated change point shows that thecurves Xn can be treated as independent. Time series plots of the scores of the residuals conﬁrmthat the assumption of constant variance is reasonable.Detection of a change point in the amount of precipitation has been recognized as an important problem in climatology and environmental science. Gallagher et al. (2012) have reviewedFig. 2. Transformed raw precipitation record for a single year n and a single location s ( ) and smoothedprecipitation Xn .sI t/ ()

10O. Gromenko, P. Kokoszka and M. Reimherrrecent research and pointed out that the usual approach is to work with aggregated annual precipitation, which results in one scalar data point per year. In our approach, we work with onefunction per year. We are concerned not only with a change in the total annual precipitation, butalso with a potential change in the timing of the precipitation, reﬂected in a change of the meanprecipitation pattern. Moreover, we are concerned with a change taking place over a region, notjust at a single location. Such a perspective can be relevant as shifts in a precipitation pattern overan agriculturally important region have profound economic consequences. We note that the annual curves X.sk / that we consider are different from the Canadian precipitation curves that wereextensively used in Ramsay and Silverman (2005); we work with a time series of curves at eachlocation, whereas Ramsay and Silverman (2005) used one curve at every location, the averageover several decades. By construction, our approach has the maximum change point detectionresolution of 1 year; Gallagher et al. (2012) proposed a method for daily data at a single location.The results of the application of the procedures described in Section 4 are presented in Tables1, 2 and 3. All tests lead to the same conclusion: one change point in the second half of the 1960s,though the test based on Λ̂1 is not quite signiﬁcant at a 5% level. The pattern of the change isshown in Fig. 3. The biggest changes are in the area around Michigan Lake and south-west of thelake. To visualize the spatial distribution of change displayed in Fig. 3, we used the spatial ﬁeldφ̂.s/ μˆ1 .s/ μ̂2 .s/ ,Table 1.Iteration123Table 2.Iteration123Table 3.Iteration123.5:1/Results of the test described in algorithm 1 97P-value Estimated changepoint year0:00130:09870:13541966——Results of the test described in algorithm 2, based on statistics Λ̂1Segment(years)DecisionCumulativevariance (%)pP-valueEstimated changepoint �—Results of the test described in algorithm 2, based on statistics Λ̂2Segment(years)DecisionCumulativevariance (%)pP-valueEstimated changepoint �—

Detection of Change in the Spatiotemporal Mean Function11Fig. 3. Spatial field showing the L2 -distance between the mean log-precipitation before and after 1966:there is an increase in precipitation throughout the year in the area around location 4 and a decrease in thefirst half of the year in the area around location 1; locations close to 2 and 3 do not show a large change nora consistent patternwhereμ̂1 .s; t/ r̂ 1rˆ Xn .s; t/,n 1μ̂2 .s; t/ .N r̂/ 1N n r 1ˆXn .s; t/:We performed ordinary kriging with the exponential covariance model to obtain the heat mapthat is shown in Fig. 3. The application of our tests conﬁrms that the heat map shows a statistically signiﬁcant change over a region, not a variation in the magnitude of change which may bedue to chance. Fig. 4 illustrates the ﬁt of the model and the typical mean precipitation curvesbefore and after the change point. Our tests show that the spatially indexed vectors of the meanfunctions before and after around 1966 are different.When the test of Berkes et al. (2009) is applied to records at individual locations, no changepoints are detected. If the curves are smoothed by using a penalty, change points at two locationsare detected but are not signiﬁcant after a multiple-testing correction. Our test combines manyweak individual signals to detect a change over a region with enhanced power.We now discuss some aspects of the implementation of the tests. The distribution of the scoresis normal to a reasonable approximation, which is a point that is exploited in the simulationstudy that is described in what follows. To take into account the curvature of the Earth we usechordal distance which is the three-dimensional Euclidean distance, so any covariance functionthat is valid in the three-dimensional Euclidean space remains valid in our application. Fig. 5shows the behaviour of the statistic T2 .r/ that is deﬁned byT2 .r/ K k 1ŵ.k/r n 1Xn .sk / Nr Xn .sk /N n 1which is used to identify the change point via expression (3.7).2,.5:2/

1.00.00.5Xn (s;t)1.00.00.5Xn (s;t)1.5O. Gromenko, P. Kokoszka and M. .005T2(r)0.0150.020Fig. 4. Examples of functional observations and mean curves at a single location in southern Illinois(, functional observations Xn .tI s/;, curves modelled

Tulane University, New Orleans, USA Piotr Kokoszka Colorado State University, Fort Collins, USA and Matthew Reimherr Pennsylvania State University, University Park, USA [Received December 2013. Final revision October 2015] Summary. The paper develops inferential methodology for detecting a change in the annual pattern of an environmental variable measured at ﬁxed locations in a spatial .