Machine Learning Interpretability Techniques In Credit Risk Modeling


Craig Peters
Senior Director - Research Model Validation, Moody's Analytics
November 2019

Research Motivation
ML algorithms improve prediction accuracy over traditional statistical models.

Research Motivation
ML algorithms are often criticized as black-box models.
Black-box explanation: "This is a cat."
Explainable AI (XAI) explanation: "This is a cat. It has fur, whiskers, and claws. It has this feature: [image]."

Agenda
1. Problem Setting
   Dataset
   Generalized Additive Model (GAM) vs XGBoost (XGB)
2. Global Interpretability
   Feature Importance
   Feature Effect
   Feature Interaction
   Alternate GAM Model
3. Local Interpretability
   Local Interpretable Model-agnostic Explanations (LIME)
   Shapley value
4. Take-aways & Questions

1. Problem Setting
A Probability of Default Model

Problem Setting
Dataset

Category        Ratio Name   Ratio Description
Activity        A03          Inventories to Sales
                A08          Current Liabilities to Sales
                A18          Change in Working Capital over Sales
Debt Coverage   DC01**       EBITDA over Interest Expense
Growth          GROW01**     Sales Growth: Sales(t)/Sales(t-1) - 1
                GROW04       Change in ROA
Leverage        LEV12**      Retained Earnings to Current Liabilities
                LEV13**      LT Debt to (LT Debt plus Net worth)
Liquidity       LIQ05**      Cash and Marketable Securities to Total Assets
Profitability   PFT01**      ROA / Net Income to Total Assets
Size            SIZE01**     Total Assets
Sector          SECTOR       14 Sectors
                DUMDEFPD     Default flag (1 = default)

** feature of interest (to be covered later)
** important features (to be covered later)

Problem Setting
Data Processing
Raw Data -> Train/Test Split -> Missing Imputation -> Transformation (LOESS / Smoothing) -> Transformed data

Problem Setting
Methodology and Results

GAM: A generalized linear model (GLM) on transformed predictors T_i(x_i)
   PD = \phi(\beta_0 + \beta_1 T_1(x_1) + \dots + \beta_N T_N(x_N))
   T_i(x_i): Loess transformation
   AR: 0.579 (train), 0.575 (test)

XGB: Ensemble tree methodology involving both bagging and boosting
   AR: 0.700 (train), 0.605 (test)

Difference in test AR (XGB vs GAM): 3 pts.
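The AR figures above can be illustrated with a short sketch. It assumes AR denotes the accuracy ratio in its usual credit-risk sense (AR = 2 * AUC - 1, equivalent to the Gini coefficient); the slides do not define it. The function name `accuracy_ratio` and the toy data are hypothetical and not part of the original deck.

```python
import numpy as np

def accuracy_ratio(default_flag, pd_score):
    """Accuracy ratio (Gini), assumed here to be AR = 2 * AUC - 1."""
    y = np.asarray(default_flag)
    s = np.asarray(pd_score, dtype=float)
    defaulters, survivors = s[y == 1], s[y == 0]
    # Compare every defaulter score with every non-defaulter score.
    diff = defaulters[:, None] - survivors[None, :]
    # AUC = share of pairs ranked correctly; ties count as one half.
    auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size
    return 2.0 * auc - 1.0

# Toy data (hypothetical): 1 = default, scores are predicted PDs.
flags = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
ar = accuracy_ratio(flags, scores)
print(f"AR = {ar:.3f}")
```

The pairwise comparison keeps the sketch short but uses memory proportional to (defaults x non-defaults); a rank-based AUC would likely be preferable on a full portfolio.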

2. Global Interpretability

Global Interpretability
Feature Importance: Permutation Test
Feature Effects: Partial Dependence Plots (PDP), Accumulated Local Effects (ALE)
Feature Interaction: Friedman's H-statistic
Alternate GAM model: Splines, Interactions

Global Interpretability
Feature Importance: Permutation Test
Permute feature(s) -> Change in AR -> Rank features
The most important features produce the largest difference in AR.
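The permute / measure / rank loop can be sketched as follows. This is a minimal illustration, not the author's implementation: AR is assumed to be 2 * AUC - 1, and `toy_model`, the simulated data, and the `n_repeats` parameter are hypothetical stand-ins for the fitted PD model and the real dataset.

```python
import numpy as np

def accuracy_ratio(y, score):
    """AR assumed to be 2 * AUC - 1 (Gini)."""
    y, s = np.asarray(y), np.asarray(score, dtype=float)
    diff = s[y == 1][:, None] - s[y == 0][None, :]
    auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size
    return 2.0 * auc - 1.0

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average drop in AR when one column is shuffled; larger = more important."""
    rng = np.random.default_rng(seed)
    base_ar = accuracy_ratio(y, predict(X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the outcome.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += base_ar - accuracy_ratio(y, predict(X_perm))
    drops /= n_repeats
    return drops, np.argsort(-drops)  # ranking: most important first

# Hypothetical stand-in for a fitted PD model: uses columns 0 and 1 only.
def toy_model(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] + 0.5 * X[:, 1])))

rng = np.random.default_rng(42)
X = rng.normal(size=(400, 3))
y = (rng.random(400) < toy_model(X)).astype(int)
drops, ranking = permutation_importance(toy_model, X, y)
print("AR drop per feature:", np.round(drops, 3))
print("ranking (most important first):", ranking)
```

Because the method only needs a `predict` function, the same loop applies unchanged to a GAM or an XGB model, which is what makes the comparison between the two possible.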

Global Interpretability
Feature Importance: Permutation Test
[Figure: permutation importance bar charts, GAM vs XGB; arrow label: Importance decreases]
The top 5 important features (LIQ05, DC01, GROW01, ...) are the same.
SIZE01 becomes more important in XGB vs GAM.
More area is covered by the bar chart in XGB vs GAM.

Global Interpretability
Feature Effects: Partial Dependence Plot (PDP)
PDP shows the marginal/partial effect of feature(s) on the predicted outcome.
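A partial dependence curve can be computed by fixing the feature of interest at each grid value for every observation and averaging the model predictions. The sketch below assumes a generic `predict` function; `toy_model`, the data, and the grid are hypothetical.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction with column `feature` forced to each grid value."""
    values = []
    for v in grid:
        X_fixed = X.copy()
        X_fixed[:, feature] = v  # all other features keep their observed values
        values.append(predict(X_fixed).mean())
    return np.array(values)

# Hypothetical model: quadratic in feature 0, linear in feature 1.
def toy_model(X):
    return X[:, 0] ** 2 + X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-2.0, 2.0, 5)
pdp = partial_dependence(toy_model, X, feature=0, grid=grid)
print(np.round(pdp, 3))
```

For this additive toy model the curve equals the grid value squared plus the sample mean of feature 1, so the PDP recovers the shape of the feature's effect up to a constant.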

Global Interpretability
Feature Effects: Partial Dependence Plot (PDP)
Analogous behavior (GAM and XGB)
LIQ05 and LEV13 were among the most important features common to both GAM and XGB.

Global Interpretability
Feature Effects: Partial Dependence Plot (PDP)
Non-analogous behavior (GAM vs XGB)
SIZE01 becomes more important in XGB, and A03 has a higher AR drop from the permutation test in XGB.

Global Interpretability
Feature Effects: Accumulated Local Effect (ALE)
PDP ignores correlations among features; ALE solves this problem.

ALE                                        PDP
1. Conditional distribution                1. Marginal distribution
2. Considers correlation of features       2. Less informative if features are correlated
3. Global view of sum of local effects     3. Global view of global effects
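A first-order ALE curve can be sketched by binning the feature, averaging the prediction change across each bin using only the observations that fall in that bin (the conditional distribution), and accumulating those local effects. This is a simplified illustration under stated assumptions: quantile bins, a single numeric feature, and a hypothetical `toy_model` with deliberately correlated inputs.

```python
import numpy as np

def ale_1d(predict, X, feature, n_bins=10):
    """First-order accumulated local effects for one numeric feature."""
    x = X[:, feature]
    edges = np.unique(np.quantile(x, np.linspace(0, 1, n_bins + 1)))
    n_intervals = len(edges) - 1
    # Interval index of each observation (the maximum goes to the last interval).
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_intervals - 1)
    local = np.zeros(n_intervals)
    counts = np.zeros(n_intervals)
    for k in range(n_intervals):
        rows = X[idx == k]
        if len(rows) == 0:
            continue
        lower, upper = rows.copy(), rows.copy()
        lower[:, feature] = edges[k]
        upper[:, feature] = edges[k + 1]
        # Local effect: mean prediction change across the interval,
        # using only observations that actually lie in it.
        local[k] = (predict(upper) - predict(lower)).mean()
        counts[k] = len(rows)
    ale = np.concatenate([[0.0], np.cumsum(local)])  # value at each edge
    # Center so the count-weighted average effect is zero.
    midpoints = (ale[:-1] + ale[1:]) / 2.0
    ale -= np.sum(midpoints * counts) / counts.sum()
    return edges, ale

# Hypothetical data with correlated features and an additive model.
rng = np.random.default_rng(1)
x0 = rng.normal(size=500)
x1 = x0 + 0.3 * rng.normal(size=500)
X = np.column_stack([x0, x1])

def toy_model(X):
    return 3.0 * X[:, 0] + X[:, 1] ** 2

edges, ale = ale_1d(toy_model, X, feature=0)
print(np.round(ale, 3))
```

Unlike the PDP, no observation is ever evaluated far from its own feature values, which is why ALE remains meaningful when features are correlated.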

Global Interpretability
Feature Effects: PDP vs. ALE (XGB)
PDP and ALE show different effects of average PD changes in response to changes in PFT01.

Global Interpretability
Feature Interaction: Friedman's H-statistic
Two Way: interaction of two variables (at a time)
All Way: interaction of one variable with the rest of the variables
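The two-way statistic can be sketched from partial dependence values evaluated at the observed data points, following the general form in Friedman and Popescu (2008). The helper names, `additive` and `interacting` toy models, and sample size are hypothetical; the function returns the squared statistic H^2, and the cost grows quadratically with the number of rows.

```python
import numpy as np

def centered_pd(predict, X, cols):
    """Partial dependence on `cols` at each observed row, mean-centered."""
    out = np.empty(len(X))
    for i in range(len(X)):
        X_fixed = X.copy()
        X_fixed[:, cols] = X[i, cols]  # fix the chosen features at row i's values
        out[i] = predict(X_fixed).mean()
    return out - out.mean()

def h2_two_way(predict, X, j, k):
    """Share of the joint effect of (j, k) not explained by the two main effects."""
    pd_jk = centered_pd(predict, X, [j, k])
    pd_j = centered_pd(predict, X, [j])
    pd_k = centered_pd(predict, X, [k])
    return np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 3))

def additive(X):      # no interaction between features 0 and 1
    return X[:, 0] + X[:, 1]

def interacting(X):   # pure interaction between features 0 and 1
    return X[:, 0] * X[:, 1]

h2_add = h2_two_way(additive, X, 0, 1)
h2_int = h2_two_way(interacting, X, 0, 1)
print(f"additive model:    H^2 = {h2_add:.4f}")
print(f"interacting model: H^2 = {h2_int:.4f}")
```

On a finite sample the estimate can land slightly above 1 for a pure interaction, so values should be read as approximate.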

Global Interpretability
Feature Interaction: Friedman's H-statistic
All Way: strong interaction of PFT01, DC01, LIQ05, GROW01 with the rest of the variables.
Two Way: strong pairwise interactions observed for SIZE01:PFT01, PFT01:DC01, LIQ05:GROW01, ...

Global Interpretability
Alternate GAM Model
Importance: Permutation test
Effects: PDP and ALE
Interactions: H-stat
Model Performance: GAM vs XGB
-> Alternate GAM model

Global Interpretability
Alternate GAM Model: Non-linearities
Original GAM: AR 0.579 (train), 0.575 (test)
Original GAM + Splines: AR 0.589 (train), 0.584 (test); gain of 1 pt.

Global Interpretability
Alternate GAM Model: Interactions
Original GAM: AR 0.579 (train), 0.575 (test)
Original GAM + Splines: AR 0.589 (train), 0.584 (test); gain of 1 pt.
Original GAM + Splines + Interactions: AR 0.594 (train), 0.589 (test); gain of 1.5 pts.

3. Local Interpretability

Local Interpretability
Local Interpretable Model-agnostic Explanations (LIME)
1. Simulate points near a specific observation (the observation of interest)
2. Generate model predictions at these points
3. Use the model predictions as the Y variable
4. Weight the new observations by proximity
5. Build a weighted linear regression (or other interpretable model)
6. Interpret the local surrogate model
Advantage: conceptually intuitive, easy to interpret
Disadvantage: simulating "good" nearby points, unstable results
Source: Molnar, C. (2018)
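The six steps above can be sketched with NumPy alone. This is not the `lime` package and not necessarily the author's setup: Gaussian sampling around the observation, the exponential proximity kernel, `kernel_width`, and the `black_box` function are all assumptions made for illustration.

```python
import numpy as np

def lime_explain(predict, x_obs, scale, n_samples=2000, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around one observation."""
    rng = np.random.default_rng(seed)
    x_obs = np.asarray(x_obs, dtype=float)
    # Step 1: simulate points near the observation of interest.
    Z = x_obs + rng.normal(size=(n_samples, len(x_obs))) * scale
    # Steps 2-3: model predictions at these points become the Y variable.
    y = predict(Z)
    # Step 4: weight the simulated points by proximity (assumed Gaussian kernel).
    dist = np.linalg.norm((Z - x_obs) / scale, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Step 5: weighted linear regression via least squares on sqrt-weighted rows.
    design = np.column_stack([np.ones(n_samples), Z - x_obs])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    # Step 6: intercept approximates the local prediction, slopes the local effects.
    return coef[0], coef[1:]

# Hypothetical non-linear black box standing in for the XGB PD model.
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(X[:, 0] - X[:, 0] * X[:, 1])))

intercept, slopes = lime_explain(black_box, x_obs=[0.5, -1.0], scale=np.ones(2))
print("local intercept:", round(float(intercept), 3))
print("local slopes:", np.round(slopes, 3))
```

Changing `seed`, `scale`, or `kernel_width` changes the simulated neighborhood and can change the slopes, which is the instability noted as a disadvantage above.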

Local Interpretability
LIME Example
[Figure: LIME explanation for Firm 86 (sector: Business Services); values shown: 0.63, 1.77, 2.40, 1.05]

Local Interpretability
Shapley Value
Originally from game theory, where it attributes the value of a team effort to individual members.
Unlike LIME, it uses the same original model in a local space.
Explains: individual vs. average PD; feature contributions towards the difference.
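For a handful of features, Shapley values can be computed exactly by enumerating every coalition. The sketch below assumes one common choice of value function: features in the coalition are set to the firm's values and the rest are averaged over a background sample. `toy_pd_model`, the background data, and the example firm are hypothetical; exact enumeration scales exponentially, so real applications typically use sampling-based approximations.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for one observation `x`."""
    n = len(x)

    def value(subset):
        # Coalition value: average prediction with `subset` fixed at x's values.
        Z = background.copy()
        if subset:
            Z[:, list(subset)] = x[list(subset)]
        return predict(Z).mean()

    phi = np.zeros(n)
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                # Marginal contribution of feature j when joining coalition S.
                phi[j] += weight * (value(S + (j,)) - value(S))
    return phi

# Hypothetical PD model: uses features 0 and 1, ignores feature 2.
def toy_pd_model(X):
    return 1.0 / (1.0 + np.exp(-(X[:, 0] + X[:, 0] * X[:, 1])))

rng = np.random.default_rng(0)
background = rng.normal(size=(100, 3))
firm = np.array([1.0, 2.0, -0.5])
phi = shapley_values(toy_pd_model, firm, background)
individual_pd = toy_pd_model(firm[None, :])[0]
average_pd = toy_pd_model(background).mean()
print("contributions:", np.round(phi, 4))
print("individual - average PD:", round(float(individual_pd - average_pd), 4))
```

The contributions sum to the gap between the firm's PD and the average PD, which matches the "individual vs. average PD" reading on the slide.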

Local Interpretability
Shapley Value Example
[Figure: Shapley value explanation for Firm 86 (sector: Business Services); values shown: 0.63, 1.77, 2.40, 1.05]

4. Take-aways

Take-aways
1. Interpretability techniques can help explain and predict black box model output.
2. Model-agnostic methods can be applied to any model, enabling a broader range of methodologies.
3. Interpretability techniques can help make today's black boxes tomorrow's interpretable models.
4. The tradeoff between interpretability and accuracy is real and can only be mitigated.

References

Permutation feature importance
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.

Partial Dependency Plot (PDP)
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 1189-1232.

Accumulated Local Effect (ALE)
Apley, D. W. (2016). Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint arXiv:1612.08468.

Friedman's H-statistics
Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.

Local interpretable model-agnostic explanations (LIME)
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144). ACM.
Patrick Hall, Navdeep Gill, Megan Kurka, & Wen Phan. Edited by: Angela Bartz. Machine Learning Interpretability with H2O Driverless AI (K-LIME).

Shapley Value
Štrumbelj, E., & Kononenko, I. (2014). Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems, 41(3), 647-665.
Lundberg, S., & Lee, S. I. (2016). An unexpected unity among methods for interpreting model predictions. arXiv preprint arXiv:1611.07478.

Discussions of Machine Learning Interpretation framework and taxonomy
Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490.
Gilpin, L. H., Bau, D., Yuan, B. Z., Bajwa, A., Specter, M., & Kagal, L. (2018, October). Explaining Explanations: An Overview of Interpretability of Machine Learning. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA) (pp. 80-89). IEEE.
Molnar, C. (2018). Interpretable machine learning: A guide for making black box models explainable. E-book at https://christophm.github.io/interpretable-ml-book/ , version dated, 10.

Appendix

Appendix
Partial Dependence Plots
The partial dependence function is defined as: [Equation: the model prediction integrated over all x_C]
   x_S: feature(s) of interest
   x_C: other features used in the model
Accumulated Local Effects
[Equation: based on the conditional distribution, where the integrand is the differential/change in PD]
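For reference, the standard forms of these two definitions (Friedman, 2001; Apley, 2016) are sketched below. The slide's own equations did not survive transcription, so the notation here is an assumption rather than the author's exact formulation: \hat f is the fitted model, z_0 is the lower integration limit, and c is the constant that centers the ALE curve.

```latex
% Partial dependence: average the prediction over the marginal distribution of x_C
\hat f_{x_S}(x_S) = E_{x_C}\left[\hat f(x_S, x_C)\right]
                  = \int \hat f(x_S, x_C)\, dP(x_C)

% Accumulated local effects: accumulate the expected local change in the
% prediction, taken over the conditional distribution of x_C given x_S = z
\hat f_{ALE}(x_S) = \int_{z_0}^{x_S}
    E_{x_C \mid x_S = z}\left[\frac{\partial \hat f(z, x_C)}{\partial z}\right] dz - c
```

The only structural difference is the distribution used for x_C: marginal for the PDP, conditional on x_S for ALE.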

Appendix
H-statistic (ranges from 0 to 1)
Two Way: built from the 1-dim PDPs and the 2-dim PDP
All Way: built from the 1-dim PDP, the (n-1)-dim PDP, and the prediction
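The standard forms from Friedman and Popescu (2008) are sketched below for reference. The slide's equations were lost in transcription, so the symbols are assumptions: PD_j and PD_{jk} denote mean-centered 1-dim and 2-dim partial dependence functions, PD_{-j} is the partial dependence on all features except j (the (n-1)-dim PDP), \hat f is the mean-centered prediction, and the sums run over the observations i.

```latex
% Two way: interaction between features j and k
H^2_{jk} = \frac{\sum_i \left[ PD_{jk}(x_j^{(i)}, x_k^{(i)})
           - PD_j(x_j^{(i)}) - PD_k(x_k^{(i)}) \right]^2}
          {\sum_i PD_{jk}^2(x_j^{(i)}, x_k^{(i)})}

% All way: interaction between feature j and all remaining features
H^2_{j} = \frac{\sum_i \left[ \hat f(x^{(i)})
          - PD_j(x_j^{(i)}) - PD_{-j}(x_{-j}^{(i)}) \right]^2}
         {\sum_i \hat f^2(x^{(i)})}
```

A value near 0 indicates that the effect decomposes additively; a value near 1 indicates that the variation is dominated by the interaction.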

Craig Peters
Sr Dir-Research Model Validation
Craig.Peters@moodys.com

© 2019 Moody's Corporation, Moody's Investors Service, Inc., Moody's Analytics, Inc. and/or their licensors and affiliates (collectively, "MOODY'S"). All rights reserved.
