Transcription
Decision Trees
Nipun Batra
Jan 8, 2019
Discrete Output: Classification

Training Data (PlayTennis)

Day  Outlook    Temp  Humidity  Wind    PlayTennis
D1   Sunny      Hot   High      Weak    No
D2   Sunny      Hot   High      Strong  No
D3   Overcast   Hot   High      Weak    Yes
D4   Rain       Mild  High      Weak    Yes
D5   Rain       Cool  Normal    Weak    Yes
D6   Rain       Cool  Normal    Strong  No
D7   Overcast   Cool  Normal    Strong  Yes
D8   Sunny      Mild  High      Weak    No
D9   Sunny      Cool  Normal    Weak    Yes
D10  Rain       Mild  Normal    Weak    Yes
D11  Sunny      Mild  Normal    Strong  Yes
D12  Overcast   Mild  High      Strong  Yes
D13  Overcast   Hot   Normal    Weak    Yes
D14  Rain       Mild  High      Strong  No

Example borrowed from Tom Mitchell's text book
Continuous Output: Regression

Training Data

Home #  Square footage  Age  Price (1000 USD)
D1      1000            20   30
D2      2000            10   [illegible]
D3      3000            5    40
D4      1240            10   50

(Square footage and Age are the inputs; Price is the output.)
Train, Validation, Test

Data is split into a Train set and a Test set.

Train → Machine learning algorithm → Model
Model applied to Test → Estimations → Performance Measure

Q: Any problem with the model?
Overfitting: the model is highly tuned to the train data and does not generalize to the test data.

Fix: split Data into Train, Validation, and Test.
Basic idea: choose a model (parameters) which optimizes performance on the validation set; the test set is used only for the final evaluation.
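The split-and-evaluate pipeline above can be sketched in plain Python (the function name, fractions, and seed are my own choices, not from the slides):

```python
import random

def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a dataset and split it into train/validation/test parts."""
    data = list(data)
    random.Random(seed).shuffle(data)  # fixed seed for reproducibility
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]
    val = data[n_test:n_test + n_val]
    train = data[n_test + n_val:]
    return train, val, test

# Tune model choices on `val`; report final performance on `test` only.
train, val, test = train_val_test_split(range(100))
```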
Need For Interpretability
Training Data (the PlayTennis table from Tom Mitchell's text, shown again)
Decisions - Will I Play Tennis?

Outlook?
├─ Sunny → Humidity? (High → No, Normal → Yes)
├─ Overcast → Yes
└─ Rain → Wind? (Strong → No, Weak → Yes)

Example borrowed from Tom Mitchell's text book
Hard to Learn Optimal Decision Tree

Finding the smallest decision tree consistent with the training data is computationally intractable, which motivates the greedy approach that follows.
Greedy Algorithm

Intuition: at each level, choose the attribute that gives the "biggest estimated performance gain".

Greedy ≠ Optimal: a locally best split need not yield the globally best tree.

Image source: analyticsvidhya
Greedy Algorithm: ID3 (Examples, Target Attribute, Attributes)

1. Create a root node for the tree
2. If all examples are +/−, return root with label +/−
3. If Attributes is empty, return root with the most common value of Target Attribute in Examples
4. Begin
   1. A ← attribute from Attributes which best classifies Examples
   2. Root ← A
   3. For each value (v) of A
      1. Add new tree branch: A = v
      2. Examples_v ← subset of Examples where A = v
      3. If Examples_v is empty: add leaf with label = most common value of Target Attribute
      4. Else: ID3 (Examples_v, Target Attribute, Attributes − {A})
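The pseudocode above can be turned into a compact runnable sketch (the dict-based tree representation and helper names are my own; the empty-branch case in step 4.3.3 never triggers here because we only branch on values observed in the data):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, attr, target):
    """Reduction in entropy from partitioning `examples` on `attr`."""
    labels = [ex[target] for ex in examples]
    remainder = 0.0
    for v in set(ex[attr] for ex in examples):
        sub = [ex[target] for ex in examples if ex[attr] == v]
        remainder += len(sub) / len(examples) * entropy(sub)
    return entropy(labels) - remainder

def id3(examples, attributes, target):
    """ID3: a leaf is a class label; an internal node is {attribute: {value: subtree}}."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:          # step 2: all examples share one class
        return labels[0]
    if not attributes:                 # step 3: no attributes left
        return Counter(labels).most_common(1)[0][0]
    # step 4.1: attribute that best classifies the examples
    best = max(attributes, key=lambda a: info_gain(examples, a, target))
    tree = {best: {}}
    for v in set(ex[best] for ex in examples):   # step 4.3: branch per value
        subset = [ex for ex in examples if ex[best] == v]
        rest = [a for a in attributes if a != best]
        tree[best][v] = id3(subset, rest, target)
    return tree

# Tom Mitchell's PlayTennis data, as in the slides
COLS = ["Outlook", "Temp", "Humidity", "Wind", "PlayTennis"]
ROWS = [("Sunny","Hot","High","Weak","No"),   ("Sunny","Hot","High","Strong","No"),
        ("Overcast","Hot","High","Weak","Yes"),("Rain","Mild","High","Weak","Yes"),
        ("Rain","Cool","Normal","Weak","Yes"), ("Rain","Cool","Normal","Strong","No"),
        ("Overcast","Cool","Normal","Strong","Yes"),("Sunny","Mild","High","Weak","No"),
        ("Sunny","Cool","Normal","Weak","Yes"),("Rain","Mild","Normal","Weak","Yes"),
        ("Sunny","Mild","Normal","Strong","Yes"),("Overcast","Mild","High","Strong","Yes"),
        ("Overcast","Hot","Normal","Weak","Yes"),("Rain","Mild","High","Strong","No")]
DATA = [dict(zip(COLS, r)) for r in ROWS]

tree = id3(DATA, COLS[:-1], "PlayTennis")
```

Running this recovers the tree shown later in the slides: Outlook at the root, Humidity under Sunny, Wind under Rain.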
Entropy

Entropy: a statistical measure to characterize the (im)purity of examples.

PlayTennis: 5 No, 9 Yes (out of 14 examples)

Entropy = − p_No log2 p_No − p_Yes log2 p_Yes
        = −(5/14) log2(5/14) − (9/14) log2(9/14)
        ≈ 0.94

Interpretation: the average number of bits needed to transmit the class of a randomly drawn example.
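The entropy calculation on this slide can be reproduced in a few lines of Python (the helper name is my own):

```python
import math

def entropy(counts):
    """Entropy in bits from class counts, e.g. [5, 9] for 5 No / 9 Yes."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

# 5 No and 9 Yes out of 14 examples, as on the slide:
h = entropy([5, 9])   # -(5/14)log2(5/14) - (9/14)log2(9/14) ≈ 0.94
```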
Information Gain

Information Gain: reduction in entropy obtained by partitioning the examples (S) on attribute A:

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)

Example: A = Wind, Values(Wind) = {Weak, Strong}

S = [9+, 5−]
S_Weak = [6+, 2−]
S_Strong = [3+, 3−]

Gain(S, Wind) = Entropy(S) − (8/14)·Entropy(S_Weak) − (6/14)·Entropy(S_Strong)
             ≈ 0.048
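A quick check of the Wind computation, working directly from the class counts (helper names are my own):

```python
import math

def entropy(counts):
    """Entropy in bits from class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def gain_from_partition(parent, parts):
    """Information gain given class counts of S and of each partition S_v."""
    n = sum(parent)
    return entropy(parent) - sum(sum(p) / n * entropy(p) for p in parts)

# Wind: S = [9+, 5-], S_Weak = [6+, 2-], S_Strong = [3+, 3-]
g_wind = gain_from_partition([9, 5], [[6, 2], [3, 3]])   # ≈ 0.048
```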
Information Gain

A = Outlook, Values(Outlook) = {Sunny, Overcast, Rain}

S = [9+, 5−]
S_Sunny = [2+, 3−]
S_Overcast = [4+, 0−]
S_Rain = [3+, 2−]

Gain(S, Outlook) = Entropy(S) − (5/14)·Entropy(S_Sunny) − (4/14)·Entropy(S_Overcast) − (5/14)·Entropy(S_Rain)
               ≈ 0.246
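The same count-based check for Outlook, compared against Wind (helper names are my own):

```python
import math

def entropy(counts):
    """Entropy in bits from class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def gain_from_partition(parent, parts):
    """Information gain given class counts of S and of each partition S_v."""
    n = sum(parent)
    return entropy(parent) - sum(sum(p) / n * entropy(p) for p in parts)

# Outlook: S_Sunny = [2+,3-], S_Overcast = [4+,0-], S_Rain = [3+,2-]
g_outlook = gain_from_partition([9, 5], [[2, 3], [4, 0], [3, 2]])  # ≈ 0.246
g_wind = gain_from_partition([9, 5], [[6, 2], [3, 3]])             # ≈ 0.048
# Outlook gives the larger gain, so ID3 picks it as the root.
```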
Information Gain

The same computation for the remaining attributes gives Gain(S, Humidity) ≈ 0.151 and Gain(S, Temperature) ≈ 0.029 (values from Tom Mitchell's text), so Outlook has the highest gain.
Worked Out Example

Splitting on Outlook at the root: S = [9+, 5−]
- Sunny: [2+, 3−]
- Overcast: [4+, 0−] → pure, label Yes
- Rain: [3+, 2−]

The Sunny and Rain branches are still impure, so ID3 recurses on them with the remaining attributes (Temp, Humidity, Wind).

For the Sunny branch:
S′ = subset of S where Outlook is Sunny
Entropy(S′) = −(2/5) log2(2/5) − (3/5) log2(3/5) ≈ 0.971

Example borrowed from Tom Mitchell's text book
Worked Out Example: Final Tree

Outlook?
├─ Sunny → Humidity? (High → No, Normal → Yes)
├─ Overcast → Yes
└─ Rain → Wind? (Strong → No, Weak → Yes)

Example borrowed from Tom Mitchell's text book