Decision Trees - Nipun Batra

Transcription

Decision TreesNipun BatraJan 8, 2019

Training oolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text ngPlayTennisNoNoYesYesYesNoYesNoYesYesYesYesYesNo2

Training CoolCoolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text esYesNo2

Training CoolCoolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text esNoYesYesYesYesYesNo2

Discrete Output :ClassificationTraining CoolCoolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text esNoYesYesYesYesYesNo2

Training DataHome #Square footageAgePrice (1000 USD)D110002030D2200010NoD33000540D4124010503

Training DataHome #Square footageAgePrice (1000 USD)D110002030D2200010NoD33000540D412401050Input3

Training DataHome #Square footageAgePrice (1000 10503

Continuous Output :RegressionTraining DataHome #Square footageAgePrice (1000 10503

Train, Validation, TestData4

Train, Validation, TestTrainTest5

Train, Validation, TestTrainTestMachine learningalgorithm5

Train, Validation, TestTrainMachine learningalgorithmTestModel5

Train, Validation, TestTrainMachine learningalgorithmTestModel5

Train, Validation, TestTrainMachine learningalgorithmTestModelEstimations5

Train, Validation, TestTrainMachine asure5

Train, Validation, TestTrainMachine learningalgorithmQ: Any problem with the model?TestModelEstimationsPerformanceMeasure5

Train, Validation, TestTrainMachine learningalgorithmQ: Any problem with the model?Overfitting: Model highly tuned totrain data, not generalizable to testTestModelEstimationsPerformanceMeasure5

Train, Validation, TestTrainTestBasic idea: Choose a model (parameters) which optimizesperformance on validation set6

Train, Validation, TestTrainValidationTestBasic idea: Choose a model (parameters) which optimizesperformance on validation set6

Train, Validation, TestTrainValidationTestMachine learningalgorithmBasic idea: Choose a model (parameters) which optimizesperformance on validation set6

Train, Validation, TestTrainMachine learningalgorithmValidationTestModelBasic idea: Choose a model (parameters) which optimizesperformance on validation set6

Need For Interpretability7

Training oolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text ngPlayTennisNoNoYesYesYesNoYesNoYesYesYesYesYesNo8

Training oolMildCoolMildMildMildHotMildExample borrowed from Tom Mitchell’s text ngPlayTennisNoNoYesYesYesNoYesNoYesYesYesYesYesNo8

Decisions - Will I Play alNoYesExample borrowed from Tom Mitchell’s text bookWindStrongNoWeakYes9

Hard to Learn Optimal Decision Tree10

Greedy AlgorithmIntuition: At each level, choose an attribute that gives“biggest estimated performance gain”11

Greedy AlgorithmIntuition: At each level, choose an attribute that gives“biggest estimated performance gain”Image source: analyticsvidhya11

Greedy AlgorithmIntuition: At each level, choose an attribute that gives“biggest estimated performance gain”Greedy! OptimalImage source: analyticsvidhya11

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /-

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A3. For each value (v) of A

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A3. For each value (v) of A1. Add new tree branch : A v

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A3. For each value (v) of A1. Add new tree branch : A v2. Examples v : subset of examples that A v

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A3. For each value (v) of A1. Add new tree branch : A v2. Examples v : subset of examples that A v3. If Examples v is empty: add leaf with label mostcommon value of Target Attribute

Greedy AlgorithmID3 (Examples, Target Attribute, Attributes)1. Create a root node for tree2. If all examples are /-, return root with label /3. If attributes empty, return root with most common value ofTarget Attribute in Examples4. Begin1. A - attribute from Attributes which best classifiesExamples2. Root - A3. For each value (v) of A1. Add new tree branch : A v2. Examples v : subset of examples that A v3. If Examples v is empty: add leaf with label mostcommon value of Target Attribute4. Else: ID3 (Examples v, Target attribute, Attributes - {A})

EntropyEntropy: Statistical measure to characterize the(im)purity of examples13

EntropyEntropy: Statistical measure to characterize the(im)purity of YesNo13

EntropyEntropy: Statistical measure to characterize the(im)purity of examplesPlayTennis5 No, 9 YesNoNoYesYesYesNoYesNoYesYesYesYesYesNo13

EntropyEntropy: Statistical measure to characterize the(im)purity of YesNo13

EntropyEntropy: Statistical measure to characterize the(im)purity of examples13

EntropyEntropy: Statistical measure to characterize the(im)purity of examplesEntropy - pNo log2 pNo - pYes log2 pYes -(5/14) log2(5/14) - (9/14)log2(9/14) 0.9413

EntropyEntropy: Statistical measure to characterize the(im)purity of examplesEntropy - pNo log2 pNo - pYes log2 pYes -(5/14) log2(5/14) - (9/14)log2(9/14) 0.9413

EntropyEntropy: Statistical measure to characterize the(im)purity of examplesEntropy - pNo log2 pNo - pYes log2 pYes -(5/14) log2(5/14) - (9/14)log2(9/14) 0.94Avg. # of bits to transmit13

Information GainInformation Gain: Reduction in entropy14

Information GainInformation Gain: Reduction in entropyBy partitioning examples (S) on attribute A14

Information GainBy partitioning examples (S) on attribute A14

Information Gain14

Information Gain14

Information trongYesWeakYesStrongNo14

Information trongYesWeakYesStrongNo A Wind14

Information GainWindPlayTennisWeakNoStrongNo WeakYes sStrongYesStrongYesWeakYesStrongNoA WindValues (Wind) Weak, Strong14

Information GainWindPlayTennisWeakNoStrongNo WeakYes WeakYesWeakYes ngYesWeakYesStrongNoA WindValues (Wind) Weak, StrongS [9 , 5-]14

Information GainWindPlayTennisWeakNoStrongNo WeakYes WeakYesWeakYes ngYesWeakYesStrongNo A WindValues (Wind) Weak, StrongS [9 , 5-]SWeak [6 , 2-]14

Information GainWindPlayTennisWeakNoStrongNo WeakYes WeakYesWeakYes StrongNo StrongYes gNoA WindValues (Wind) Weak, StrongS [9 , 5-]SWeak [6 , 2-]SStrong [3 , 3-]14

Information GainWindPlayTennisWeakNoStrongNo WeakYes WeakYesWeakYes StrongNo StrongYes WeakNo WeakYesWeakYesStrongYesStrongYesWeakYesStrongNoA WindValues (Wind) Weak, StrongS [9 , 5-]SWeak [6 , 2-]SStrong [3 , 3-]Gain (S, Wind) Entropy (S) (8/14)*Entropy (SWeak) (6/14)*Entropy(SStrong)14

Information trongYesWeakYesStrongNoA Wind Values (Wind) Weak, Strong S [9 , 5-] SWeak [6 , 2-] SStrong [3 , 3-] Gain (S, Wind) Entropy (S) (8/14)*Entropy (SWeak) (6/14)*Entropy(SStrong) 0.048 14

Information unnyYesOvercastYesOvercastYesRainNo15

Information unnyYesOvercastYesOvercastYesRainNo A Outlook15

Information GainOutlookPlayTennis SunnyNo sRainNoA OutlookValues (Outlook) Sunny,Overcast, Rain15

Information GainOutlookPlayTennis SunnyNo sRainNo A OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]15

Information GainOutlookPlayTennis SunnyNo SunnyNoOvercastYesRainYes RainYes rcastYesOvercastYesRainNoA OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]SSunny [2 , 3-]15

Information GainOutlookPlayTennis SunnyNo SunnyNoOvercastYesRainYes RainYes RainNoOvercastYes esRainNoA OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]SSunny [2 , 3-]SOvercast [4 , 0-]15

Information GainOutlookPlayTennis SunnyNo SunnyNoOvercastYesRainYes RainYes RainNoOvercastYes esRainNo A OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]SSunny [2 , 3-]SOvercast [4 , 0-]SRain [3 , 2-]15

Information GainOutlookPlayTennis SunnyNo SunnyNoOvercastYesRainYes RainYes RainNoOvercastYes SunnyNo SunnyYes RainYesSunnyYesOvercastYesOvercastYesRainNoA OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]SSunny [2 , 3-]SOvercast [4 , 0-]SRain [3 , 2-]Gain (S, Outlook) Entropy (S) (5/14)*Entropy (SSunny) (4/14)*Entropy(SOvercast)15

Information GainOutlookPlayTennis SunnyNo SunnyNoOvercastYesRainYes RainYes RainNoOvercastYes SunnyNo SunnyYes RainYesSunnyYesOvercastYesOvercastYesRainNoA OutlookValues (Outlook) Sunny,Overcast, RainS [9 , 5-]SSunny [2 , 3-]SOvercast [4 , 0-]SRain [3 , 2-]Gain (S, Outlook) Entropy (S) (5/14)*Entropy (SSunny) (4/14)*Entropy(SOvercast)(5/14)*Entropy(SRain)15

Information unnyYesOvercastYesOvercastYesRainNoA Outlook Values (Outlook) Sunny,Overcast, Rain S [9 , 5-] SSunny [2 , 3-] SOvercast [4 , 0-] SRain [3 , 2-] Gain (S, Outlook) Entropy (S) (5/14)*Entropy (SSunny) (4/14)*Entropy(SOvercast)(5/14)*Entropy(SRain) 0.246 15

Information rature16

Worked Out Example9 , 5OutlookSunnyOvercastRain2 , 3-4 , 0-3 , 2-Example borrowed from Tom Mitchell’s text book17

Worked Out Example9 , 5OutlookSunnyOvercast2 , 3-Rain3 , 2YesExample borrowed from Tom Mitchell’s text book18

Worked Out Example9 , 5OutlookRainWindPlayTennisHot3 unnyOutlookTemp2 , 3-SunnyExample borrowed from Tom Mitchell’s text bookOvercastHumidity19

Worked Out Example9 , 5OutlookRainWindPlayTennisHot3 unnyOutlookTemp2 , 3-SunnyS’ S outlook is sunnyEntropy (S’) -(2/5) log2(2/5) (3/5)log2(3/5)Example borrowed from Tom Mitchell’s text bookOvercastHumidity20

Worked Out alNoYesExample borrowed from Tom Mitchell’s text bookWindStrongNoWeakYes21

Example borrowed from Tom Mitchell's text book 2 Day Outlook Temp Humidity Wind PlayTennis D1 Sunny Hot High Weak No D2 Sunny Hot High Strong No D3 Overcast Hot High Weak Yes . Machine learning algorithm Estimations. Train, Validation, Test 5 Train Test Model Machine learning algorithm Estimations Performance Measure. Train, Validation .