Large Scale Matrix Factorization - College Of Computing & Informatics

Transcription

Large Scale MatrixFactorizationFei WangDivision of Health InformaticsDepartment of Healthcare Policy and ResearchWeill Cornell Medical CollegeCornell University12/8/16IEEE Big Data Conference 20161

Outline Introduction Matrix Factorization Technologies Conclusions and Discussions12/8/16IEEE Big Data Conference 20162

What is a matrix?12/8/16IEEE Big Data Conference 20163

Matrix: A Natural Representation forNetworks/Graphs/Relational Data12/8/16IEEE Big Data Conference 20164

Matrices in Social Networks12/8/16IEEE Big Data Conference 20165

Matrices in AnkleEdemaInitial12/8/16FeverExpandedChestPainIEEE Big Data Conference 2016RalesInitial6

Matrices in HealthcareTom, Male, 30HypertensionJohn, Male, 53Heart FailureRoy, Male, 40SepsisSara, Female, 28Asthma12/8/16IEEE Big Data Conference 2016Natasha, Female, 42HyperlipidemiaJack, Male, 30Pneumonia7

Matrices in Healthcare12/8/16IEEE Big Data Conference 20168

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 20169

Matrix Factorization12/8/16IEEE Big Data Conference 201610

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201611

Orthogonal Projection12/8/16IEEE Big Data Conference 201612

Principal Component Analysis12/8/16IEEE Big Data Conference 201613

Principal Component Analysis12/8/16IEEE Big Data Conference 201614

Principal Component Analysis12/8/16IEEE Big Data Conference 201615

Principal Component Analysis12/8/16IEEE Big Data Conference 201616

Principal Component Analysis12/8/16IEEE Big Data Conference 201617

Principal Component Analysis12/8/16IEEE Big Data Conference 201618

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201619

Singular Value Decomposition12/8/16IEEE Big Data Conference 201620

Singular Value Decomposition12/8/16IEEE Big Data Conference 201621

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201622

Nonnegative MatrixFactorization Consider the following problem M 2429 facial imagesEach image of size n 19 by 19 361Matrix V n by m is the original datasetWe want to approximate V by two lower rank matrix W(n by 49) and H (49 by m) V WH Constraints All entries of W and H are non-negative12/8/16IEEE Big Data Conference 201623

Nonnegative MatrixFactorization How well can W and H approximate V How can we interpret the resultDaniel12/8/16D. Lee and H. Sebastian Seung. LearningIEEEtheBigpartsobjects bynon-negative matrix factorization. Nature24401,DataofConference2016788-791 (21 October 1999)

Nonnegative MatrixFactorization Factorizing a nonnegative matrix to the product oftwo low-rank matrices12/8/16IEEE Big Data Conference 201625

Nonnegative MatrixFactorization Multiplicative update methodDaniel D. Lee and H. Sebastian Seung (2001). Algorithms for Non-negative Matrix Factorization. NIPS 2001.12/8/16IEEE Big Data Conference 201626

Nonnegative MatrixFactorization Initialize F and G with nonnegative values Iterate the following procedure: Fixing Fixing, Solve, Solve(1) Projected Gradient: http://www.csie.ntu.edu.tw/ cjlin/nmf/(2) Newtown Type of /software/nnma/index.html(3) Block Principal Pivoting: https://sites.google.com/site/jingukim/nmf bpas.zip?attredirects 0P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of errorestimates of data values. Environmetrics, 5(1):111–126, 1994C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation,19(2007), 2756-2779.D. Kim, S. Sra, I. S. Dhillon, Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem.SDM 2007.12/8/16IEEE Big Data Conference 201627J. Kim and H. Park. Toward Faster Nonnegative Matrix Factorization: A New Algorithm and Comparisons. ICDM 2008.

Nonnegative Matrix Factorization:An Online AlgorithmWang, Fei, Ping Li, and Arnd Christian König. "Efficient Document Clustering via Online Nonnegative MatrixFactorizations." In SDM, vol. 11, pp. 908-919. 2011.12/8/16IEEE Big Data Conference 201628

Online NMF: Updating gNonnegative Least Square (NLS) Active Set Projected Gradient Principal Block PivotingC. L. Lawson and R. J. Hanson. Solving Least Squares Problems.Society for Industrial Mathematics, 1995.C. J. Lin. Projected gradient methods for non-negative matrixfactorization. Neural Computation, 19(10):2756-2779.J. Kim and H. Park. Toward faster nonnegative matrixfactorization: A new algorithm and comparisons. In Proceedingsof the 8th International Conference on Data Mining (ICDM),pages 353-362, 2008.12/8/16IEEE Big Data Conference 201629

Online NMF: Updating Flossgradient1st order projectedgradient descent2nd order projectedgradient descent12/8/16IEEE Big Data Conference 201630

Online NMF: Experiments12/8/1631

NMF with Random ProjectionsSolutionProcedureObjectivesProjected Objectives32

Random NMF: TheoreticalAnalysisR. I. Arriaga and S. Vempala. An algorithmic theory of learning: Robust concepts and random projection. In FOCS, 1999.12/8/16IEEE Big Data Conference 201633

Random NMF: ExperimentsData Dimension: 12600; Data Size: 20312/8/16IEEE Big Data Conference 201634

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201635

Patient EHR MatrixWang, Fei, Noah Lee, Jianying Hu, Jimeng Sun, and Shahram Ebadollahi. "Towards heterogeneoustemporal clinical event pattern discovery: a convolutional approach." In Proceedings of the 18th ACMSIGKDD international conference on Knowledge discovery and data mining, pp. 453-461. ACM, 2012.12/8/16IEEE Big Data Conference 201636

Temporal Patterns12/8/16IEEE Big Data Conference 201637

One-Side Convolution12/8/16IEEE Big Data Conference 201638

One-Side Convolutional NMF12/8/16IEEE Big Data Conference 201639

Multiplicative Updates12/8/16IEEE Big Data Conference 201640

Synthetic Example12/8/16IEEE Big Data Conference 201641

Bag-of-Pattern Representation[1 3 1 1][0 2 1 2][0 3 1 0]42

Prediction of CHF Onset RiskCase: 1,127Control: 3,850PredictionWindow: 180 daysObservationWindow: 360 daysLogisticRegression12/8/16IEEE Big Data Conference 201643

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201644

Matrix Factorization12/8/16IEEE Big Data Conference 201645

SparsityLowerBound46

SparsityRobert Tibshirani. Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society, Series B. 1994.12/8/16IEEE Big Data Conference 201647

Sparsity Group Lasso: L1/2 norm Exclusive Lasso: L2/1 norm Elastic Net Regularization Fused Lasso Tree Structured Group LassoSLEP: A Sparse LearningPackagehttp://www.public.asu.edu/ jye02/Software/SLEP/Lukas Meier, Sara Van De Geer, Peter Bühlmann. The group lasso for logistic regression. Journal of the Royal StatisticalSociety: Series B, 70(1), 53–71, 2008.Y. Zhou, R. Jin, and S. C. H. Hoi. Exclusive Lasso for Multi-task Feature Selection. AISTATS 2010.Zou, Hui; Hastie, Trevor. Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society,Series B: 301–320. 2005.R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, K. Knight. Sparsity and smoothness via the fused lasso. Journal of the RoyalStatistical Society: Series B. 67(1), 91–108. 2005.12/8/16IEEE Big Data Conference 201648J. Liu, J. Ye. Moreau-Yosida Regularization for Grouped Tree Structure Learning. NIPS 2010.

Dictionary Learning12/8/16IEEE Big Data Conference 201649

Group Sparse CodingBengio, Samy, et al. "Group sparse coding." Advances in neural information processing systems. 2009.12/8/16IEEE Big Data Conference 201650

Automatic Group Sparse CodingWang, Fei, Noah Lee, Jimeng Sun, Jianying Hu, and Shahram Ebadollahi. "Automatic Group SparseCoding." In AAAI. 2011.12/8/16IEEE Big Data Conference 201651

Synthetic Example12/8/16IEEE Big Data Conference 201652

Synthetic Example12/8/16IEEE Big Data Conference 201653

Outline Introduction Matrix Factorization Technologies Principal Component AnalysisSingular Value DecompositionNonnegative Matrix FactorizationConvolutional Matrix FactorizationRegularized Matrix FactorizationInductive Matrix Factorization Conclusions and Discussions12/8/16IEEE Big Data Conference 201654

Inductive Matrix FactorizationNatarajan, Nagarajan, and Inderjit S. Dhillon. "Inductive matrix completion for predicting gene–diseaseassociations." Bioinformatics 30, no. 12 (2014): i60-i68.12/8/16IEEE Big Data Conference 201655

Inductive Matrix Factorization12/8/16IEEE Big Data Conference 201656

Patient EHR Matrix Wang, Fei, Noah Lee, Jianying Hu, Jimeng Sun, and Shahram Ebadollahi. "Towards heterogeneous temporal clinical event pattern discovery: a convolutional approach." In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 453-461. ACM, 2012. 12/8/16 IEEE Big Data Conference 2016 36