Non-Gaussian Methods for Learning Linear Structural Equation Models

Transcription

2010/7/8

UAI 2010 Tutorial, Catalina Island
Non-Gaussian Methods for Learning Linear Structural Equation Models (Part II)
Shohei Shimizu and Yoshinobu Kawahara, Osaka University

Outline of Part II
Some recent advances in LiNGAM analysis:
1. LiNGAM combined with time-series models
  – AR-LiNGAM (Hyvarinen et al., 2010)
  – ARMA-LiNGAM (Kawahara et al., 2010)
2. LiNGAM with latent confounders
  – lvLiNGAM (Hoyer et al., 2006)
  – GroupLiNGAM (Kawahara et al., 2010)

LiNGAM combined with time-series models

Time-series analysis with LiNGAM
How useful is it to analyze time-series data using the non-Gaussianity of the data?
Instantaneous effects can be taken into account explicitly by combining LiNGAM analysis with classical time-series models:
  – AR-LiNGAM (Hyvarinen et al., 2010)
  – ARMA-LiNGAM (Kawahara et al., 2010)

Instantaneous and lagged effects
[Figure: lagged effect and instantaneous effect]
If the time resolution of the measurements is sufficiently high, these effects can be captured by estimating classical time-series models, such as AR and ARMA models. Otherwise, how should instantaneous effects be dealt with?

Autoregressive models
Represent the current state with the past states (first order, second order, and in general p-th order) plus a disturbance.
The disturbances are usually assumed to be white noise.
An AR model is one of the standard tools for analyzing time-series data and has been successfully applied in a variety of fields, such as economics (Mills, 1990; Percival & Andrew, 1993).
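
As a rough illustration of the autoregressive model described above, the following sketch simulates a first-order multivariate AR process with non-Gaussian white-noise disturbances and estimates its coefficient matrix by ordinary least squares. The notation x(t) = M x(t-1) + n(t), the function names, the Laplace disturbances and the coefficient values are assumptions made for this example; they do not come from the slides.

```python
import numpy as np

def simulate_var1(M, n_steps, rng):
    """Simulate x(t) = M x(t-1) + n(t) with Laplace (non-Gaussian) white noise."""
    d = M.shape[0]
    x = np.zeros((n_steps, d))
    noise = rng.laplace(size=(n_steps, d))
    for t in range(1, n_steps):
        x[t] = M @ x[t - 1] + noise[t]
    return x

def fit_var1(x):
    """Least-squares estimate of M and the residuals in x(t) = M x(t-1) + n(t)."""
    past, present = x[:-1], x[1:]
    # Solve past @ coef ~= present, so that coef equals M transposed.
    coef, *_ = np.linalg.lstsq(past, present, rcond=None)
    residuals = present - past @ coef
    return coef.T, residuals

rng = np.random.default_rng(0)
M_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])  # illustrative values, chosen to be stable
x = simulate_var1(M_true, 5000, rng)
M_hat, residuals = fit_var1(x)
print(np.round(M_hat, 2))
```

The residuals returned here play the role of the estimated disturbance sequence that a later LiNGAM step would analyze.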

Incorporating instantaneous effects
Introduce the instantaneous term into AR models (AR-LiNGAM) (Hyvarinen et al., 2010).
How can the model including instantaneous effects be estimated?
1. Assume that the external influences are non-Gaussian.
2. Apply LiNGAM analysis.

Estimation (1/2)
Relation between the two models:
  – AR model: lag index i = 1, ..., p
  – AR-LiNGAM: lag index i = 0, 1, ..., p
The relation is given for the regression coefficients and for the disturbances.
This is an SEM with non-Gaussian external influences.

Estimation (2/2)
1. Estimate a multivariate AR model (i.e., calculate its coefficient matrices) and then compute its residuals.
2. Apply LiNGAM analysis to the estimated residuals and calculate the matrix of instantaneous effects.
3. Using the estimated matrix of instantaneous effects, calculate the parameters of AR-LiNGAM through the relation between the two models.

Extension to ARMA model (1/2)
AR model:
  – can estimate apparent effects or the power spectrum,
  – but cannot, in principle, express direct relationships between variables.
ARMA (autoregressive moving-average) model:
  – A more general representation for time-series data (an exact representation of linear differential equations in the discrete time domain).
(An AR model is an asymptotic expansion of an ARMA model.)

Extension to ARMA model (2/2)
The analogous relationships between ARMA models and ARMA-LiNGAM models still hold (Kawahara et al., 2010), both for the regression coefficients and for the disturbances.
Again, this is an SEM with non-Gaussian external influences.

Connection to Granger causality (1/2)
Suppose a multivariate process is partitioned into two groups of processes.
Granger causality* (Granger, 1976; Boudjellaba, 1992): the processes in one group do not cause the process in the other if and only if, for all t, the variance of the prediction error of the latter is not reduced by adding the past sequence of the former up to time t.
*) Granger causality is not necessarily a natural extension of causality for i.i.d. data, which is usually defined based on the counterfactual model.
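
The third estimation step can be sketched numerically. The transcription does not preserve the formulas, so this example assumes the usual notation: AR coefficients M_i (i = 1, ..., p) with disturbance n(t), AR-LiNGAM coefficients B_i (i = 0, ..., p) with external influences e(t), and the relations M_i = (I - B_0)^{-1} B_i and n(t) = (I - B_0)^{-1} e(t). The function names and the numbers are hypothetical.

```python
import numpy as np

def ar_lingam_to_ar(B0, B_lagged):
    """AR coefficients implied by AR-LiNGAM: M_i = (I - B0)^{-1} B_i."""
    inv = np.linalg.inv(np.eye(B0.shape[0]) - B0)
    return [inv @ B for B in B_lagged]

def ar_to_ar_lingam(B0, M_lagged):
    """Step 3: recover lagged AR-LiNGAM effects, B_i = (I - B0) M_i."""
    I = np.eye(B0.shape[0])
    return [(I - B0) @ M for M in M_lagged]

# Illustrative model: x1 -> x2 instantaneously, no cross-variable lagged effect.
B0 = np.array([[0.0, 0.0],
               [0.8, 0.0]])
B1 = np.array([[0.5, 0.0],
               [0.0, 0.4]])

M1 = ar_lingam_to_ar(B0, [B1])[0]
B1_recovered = ar_to_ar_lingam(B0, [M1])[0]
print(M1)            # [[0.5 0. ] [0.4 0.4]]
print(B1_recovered)  # equals B1 up to rounding
```

In this toy case the AR coefficient matrix shows an apparent lagged effect from x1 to x2 (the 0.4 in the lower-left entry) even though the direct lagged effect in B1 is zero. That is one way to read the remark that an AR model estimates apparent effects rather than direct relationships.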

Connection to Granger causality (2/2)
ARMA model: Granger causality (Boudjellaba, 1992)
ARMA-LiNGAM: (Hyvarinen et al., 2010; Kawahara et al., 2010)
If the order in the sense of Granger causality completely agrees with the instantaneous effects, then the order is preserved even if the instantaneous effects are neglected.

Application to real data
Duplex-pendulum system:
[Figure: chaotic pattern of the duplex-pendulum system; axes: Time [s], Rad]
Analytic model for the duplex-pendulum system.

Application to physical system (cont.)
Estimated lagged effects by AR- and ARMA-LiNGAM:
[Figure: panels AR-LiNGAM and ARMA-LiNGAM, each showing the effects from t-1 to t and from t-2 to t]
Although dominant patterns are captured by both models, the chaotic effect is captured only by ARMA-LiNGAM.

Summary (LiNGAM combined with time-series models)
Non-Gaussianity could be useful for analyzing time-series data (AR-LiNGAM and ARMA-LiNGAM).
  – Instantaneous effects can be taken into account by using the non-Gaussianity of disturbances.
AR-LiNGAM (or ARMA-LiNGAM) is identified by first estimating a classical AR model (or ARMA model) and then applying LiNGAM analysis to the disturbance sequences.
The order in the sense of Granger causality is satisfied if and only if both the instantaneous and the lagged effects in AR-LiNGAM (or ARMA-LiNGAM) give the same order.

LiNGAM with Latent Confounders

Latent confounder
Independent external influences (the assumption in LiNGAM): no latent confounder (Spirtes et al., 2000).
Latent confounder: a latent variable which is a parent of two or more observed variables.
A latent confounder induces dependency among external influences.
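
The last statement can be checked with a small simulation. This is only a sketch under assumed settings: the variable names, the Laplace distributions and the coefficients are chosen for illustration, and plain correlation is used as the simplest visible symptom of dependence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Hypothetical latent confounder f and independent non-Gaussian influences e1, e2.
f = rng.laplace(size=n)
e1 = rng.laplace(size=n)
e2 = rng.laplace(size=n)

# Illustrative model: f is a parent of both observed variables, and x1 -> x2.
x1 = 1.0 * f + e1
x2 = 0.8 * x1 + 1.0 * f + e2

# If f is ignored, the external influences of the observed-only model
# absorb f and are no longer independent of each other.
u1 = x1               # equals f + e1
u2 = x2 - 0.8 * x1    # equals f + e2

corr_with_confounder = np.corrcoef(u1, u2)[0, 1]
corr_without_confounder = np.corrcoef(e1, e2)[0, 1]
print(round(corr_with_confounder, 2), round(corr_without_confounder, 2))
```

With these settings the population correlation between u1 and u2 is 0.5, because f contributes half of the variance of each, while e1 and e2 remain uncorrelated.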

Motivation of this topic
Actual data might include latent confounders. In that case, the LiNGAM assumption that there are no latent confounders is violated.
How to overcome this?
  – lvLiNGAM (Hoyer et al., 2006): overcomplete ICA.
  – GroupLiNGAM (Kawahara et al., 2010): an extension of the principle of DirectLiNGAM to ‘sets’.

Latent variable LiNGAM
Introduce latent confounders f into the LiNGAM model, with non-Gaussian and independent sources.
Overcomplete ICA (Lewicki & Sejnowski, 2000; Eriksson & Koivunen, 2004)
How to classify f and e? And how to assign f_i?

Basic idea of lvLiNGAM
1. Remove external influences.
2. Find a pair of observed variables that has no observed parents.
  – Mark their common parent as a latent confounder.
  – The existence of such a pair is guaranteed by the assumption that there is “no latent confounder that has total effects on some observed variable and its descendants only.”
3. Repeat 1-2.

Find an external influence
Mixing matrix, j-th column: the j-th element of s is an external influence.

Find a latent confounder
1. If the j-th row vector ‘covers’ the i-th row vector, one of the two corresponding variables is a parent of the other.
   ex.) The i-th row vector of A has a non-zero element at the j-th column and zeros elsewhere.
2. If the i-th and j-th row vectors do not cover each other, the two corresponding variables have no order.
If the i-th row vector ‘covers’ no other rows, the i-th variable has no observed parents.

Empirical example
[Figure: original network and estimated network]
(Two different networks, which have the same ordering of variables, were estimated in this case.)
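
To make the link between the latent variable LiNGAM model and overcomplete ICA concrete, the sketch below builds the mixing matrix for a toy model. It assumes the model form x = B x + Lam f + e, so that x = A s with s = (e, f) and A = (I - B)^{-1} [I, Lam]. The symbols B and Lam, the function name and the numbers are assumptions for illustration, since the slide equation is not preserved in the transcription.

```python
import numpy as np

def lv_lingam_mixing_matrix(B, Lam):
    """Mixing matrix A of x = A s, s = (e, f), for the model x = B x + Lam f + e."""
    d = B.shape[0]
    return np.linalg.inv(np.eye(d) - B) @ np.hstack([np.eye(d), Lam])

# Two observed variables (x1 -> x2) and one latent confounder f.
B = np.array([[0.0, 0.0],
              [0.8, 0.0]])
Lam = np.array([[1.0],
                [0.5]])
A = lv_lingam_mixing_matrix(B, Lam)
print(A)  # 2 rows, 3 columns: more sources than observed variables

# Check that x = A s reproduces the structural equations on random sources.
rng = np.random.default_rng(0)
e = rng.laplace(size=(2, 1000))
f = rng.laplace(size=(1, 1000))
x_mixing = A @ np.vstack([e, f])

x1 = Lam[0, 0] * f[0] + e[0]
x2 = B[1, 0] * x1 + Lam[1, 0] * f[0] + e[1]
x_structural = np.vstack([x1, x2])
```

The matrix A has more columns than rows, which is why an overcomplete ICA method is needed rather than standard ICA.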

Basic idea of GroupLiNGAM
DirectLiNGAM: the variable ordering is estimated by iteratively finding an exogenous variable.
GroupLiNGAM: the group ordering (i.e., the ordering of sets of variables) is estimated by recursively finding an exogenous set (defined later).
  – Extension of the principle of DirectLiNGAM to ‘sets’.
  – Applicable to data with latent confounders.

Exogenous set
Let the variables be partitioned into two subsets. One subset of variables is said to be exogenous against the other if the corresponding partition of the matrix B has a block form with a zero block.
ex.) Group ordering: {1,2} {3}

Exogenous set (cont.)
Lemma: a set of variables is exogenous if and only if it is independent of its residual, i.e., the residual obtained when the remaining variables are regressed on the set.
This lemma extends DirectLiNGAM to the ‘set’ case.
ex.) An exogenous set is independent of its residual; a set that is not exogenous is not independent of its residual.

Identification of an exogenous set
Find an exogenous set: find a subset of variables that is independent of the residuals.
The zero-one structure of matrix B still holds in the SEMs of the exogenous set and of the residuals, respectively.
[Figure: block structure of B for the exogenous set and the residuals; group ordering {1,2} {3} {4,5} with S {1,2,3}]

Estimation (1/2)
Find a subset of U (U is the set of variables) such that the independence measure I between the subset and its residuals is below a threshold, where the threshold is a small real number.
Some independence measure I, such as mutual information (Kraskov et al., 2004) or HSIC (Gretton et al., 2005), is used.
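
The lemma suggests a simple numerical check: regress the remaining variables on a candidate set and score the dependence between the set and the residuals. The sketch below is only an illustration. In particular, the score used here (the largest absolute correlation between squared variables) is a crude stand-in chosen to keep the example short, not the mutual information or HSIC measures named on the slide, and all names and coefficients are hypothetical.

```python
import numpy as np

def squared_corr_dependence(S, R):
    """Crude dependence score: max |corr(s^2, r^2)| over columns of S and R."""
    score = 0.0
    for s in S.T:
        for r in R.T:
            score = max(score, abs(np.corrcoef(s ** 2, r ** 2)[0, 1]))
    return score

def exogeneity_score(X, subset):
    """Regress the other variables on X[:, subset] and score set-vs-residual dependence."""
    X = X - X.mean(axis=0)
    rest = [j for j in range(X.shape[1]) if j not in subset]
    S, T = X[:, list(subset)], X[:, rest]
    coef, *_ = np.linalg.lstsq(S, T, rcond=None)
    return squared_corr_dependence(S, T - S @ coef)

# Toy data: x1 and x2 share a latent confounder f; x3 is a child of both.
rng = np.random.default_rng(0)
n = 50000
w = np.sqrt(3.0)  # uniform on [-w, w] has unit variance and is non-Gaussian
f, e1, e2, e3 = rng.uniform(-w, w, size=(4, n))
x1 = f + e1
x2 = f + e2
x3 = 0.8 * x1 + 0.5 * x2 + e3
X = np.column_stack([x1, x2, x3])

score_group = exogeneity_score(X, [0, 1])    # candidate set {x1, x2}
score_x1_alone = exogeneity_score(X, [0])    # candidate set {x1}
score_x3_alone = exogeneity_score(X, [2])    # candidate set {x3}
print(score_group, score_x1_alone, score_x3_alone)
```

In this toy model only the pair {x1, x2} yields a score near zero. Neither x1 nor x2 is exogenous on its own because of the shared confounder, which is the situation the set-based formulation is meant to handle.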

Estimation (2/2)
Thus, the group ordering can be found by recursively finding a partition of variables into an exogenous set and the rest of the variables until no further partition is found.
[Figure: recursive partition of {1,2,3,4,5} into {1,2,3} and {4,5}, then of {1,2,3} into {1,2} and {3}, giving {1,2} {3} {4,5}]

Application to sociology (1/2)
Status attainment model based on domain knowledge*:
[Figure: associated graph with nodes Father’s Education, Father’s Occupation, Number of Siblings, Son’s Education, Son’s Occupation, Son’s Income]
Group ordering: {1, 3, 6} {5} {4} {2}
*) The dataset is obtained from a sociological data repository, the General Social Survey (www.norc.org/GSS Website/)

Application to sociology (2/2)
Comparative results by ICA-LiNGAM, DirectLiNGAM and GroupLiNGAM (variable 2 is omitted because its data is not well constructed):
Domain knowledge: {1, 3, 6} {5} {4} ( {2})
ICA-LiNGAM: {5} {6} {3} {1} {4}
DirectLiNGAM: {6} {1} {3} {4} {5}
GroupLiNGAM: {6} {1,3} {5} {4}
(Mutual information is used as the independence measure.)
GroupLiNGAM seems to give a reasonable solution.

Summary (LiNGAM with Latent Confounders)
Although LiNGAM analysis assumes no latent confounders, this assumption is often violated in practice.
We introduced two approaches that allow latent variables in LiNGAM analysis.
  – lvLiNGAM (overcomplete ICA)
  – GroupLiNGAM (extension of DirectLiNGAM to the ‘set’ case)

Summary
We introduced some recent advances in LiNGAM analysis:
  – LiNGAM combined with time-series models (AR-LiNGAM (Hyvarinen et al., 2010), ARMA-LiNGAM (Kawahara et al., 2010))
  – LiNGAM with latent variables (lvLiNGAM (Hoyer et al., 2006), GroupLiNGAM (Kawahara et al., 2010))
These are expected to make LiNGAM analysis more applicable in practice.
These are just a small sample, and there could be several new directions for future research on LiNGAM analysis.
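
The recursive search described under Estimation (2/2) can be sketched as follows. This is a simplified illustration, not the published GroupLiNGAM procedure: it tries subsets by brute force from smallest to largest, peels off the first subset whose score falls below a threshold, and recurses on the residuals. The dependence score (correlation between squared variables), the threshold value 0.05, the function names and the toy data are all assumptions made for this example.

```python
import itertools
import numpy as np

def dependence(S, R):
    """Crude stand-in for an independence measure: max |corr(s^2, r^2)|."""
    return max(abs(np.corrcoef(s ** 2, r ** 2)[0, 1]) for s in S.T for r in R.T)

def split_exogenous(X, delta):
    """Return (subset, rest, residuals) for the smallest subset scoring below delta."""
    d = X.shape[1]
    for size in range(1, d):
        for subset in itertools.combinations(range(d), size):
            rest = [j for j in range(d) if j not in subset]
            S, T = X[:, list(subset)], X[:, rest]
            coef, *_ = np.linalg.lstsq(S, T, rcond=None)
            residuals = T - S @ coef
            if dependence(S, residuals) < delta:
                return list(subset), rest, residuals
    return None  # no further partition found

def group_ordering(X, labels, delta=0.05):
    """Recursively peel off exogenous sets; returns a list of label groups."""
    X = X - X.mean(axis=0)
    split = split_exogenous(X, delta)
    if split is None:
        return [list(labels)]
    subset, rest, residuals = split
    head = [labels[j] for j in subset]
    return [head] + group_ordering(residuals, [labels[j] for j in rest], delta)

# Toy data: variables 0 and 1 share a latent confounder; variable 2 is their child.
rng = np.random.default_rng(0)
w = np.sqrt(3.0)
f, e1, e2, e3 = rng.uniform(-w, w, size=(4, 50000))
x1, x2 = f + e1, f + e2
x3 = 0.8 * x1 + 0.5 * x2 + e3
ordering = group_ordering(np.column_stack([x1, x2, x3]), labels=[0, 1, 2])
print(ordering)
```

On this toy data the two confounded variables should come out as one group ahead of their child. The brute-force subset search grows exponentially with the number of variables, so the sketch is only practical for very small examples.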
