Essence Of Machine Learning (and Deep Learning) - GitHub

Transcription

Essence of Machine Learning (and Deep Learning)
Hoa M. Le
Data Science Lab, HUST
hoamle.github.io

Examples: https://www.youtube.com/watch?v=BmkA1ZsG2P4 -part-1/2

Machine Learning is about
- a computer program (machine) that learns to do a task (problem) from experience (data); learning: improved performance with more experience - Tom Mitchell
- predictive modelling with sample data: "heuristics" & statistical modelling
Note 1: "heuristic" as in "intuitive, but not (yet!) rigorously proven by mathematical tools to some extent"
Note 2: predictive modelling can also take the form of rule-based systems, models in physics, etc.

BUILD A MACHINE LEARNING SOLUTION
the Pipeline

[Pipeline diagram: Question, Experimental Design, Data acquisition (Experience), Data pre-process, Modelling (Machine; what ML is mostly about), Assessment (Performance)]

[Pipeline diagram: Question/Hypothesis, Experimental Design, Data sampling, Data pre-process, Modelling (what ML is mostly about), Assessment, Interpretation (explain/analyse the results)]

Question/Hypothesis
Q.a. What are there in an arbitrary photo?
Q.b. What is there in an arbitrary photo?
Q.c. Is there any puppy in an arbitrary photo?
[Example image labels: cat, flower, dog, jet, ground, grass]
Other questions:
- Where are the puppies in a photo?
- How confident can I be that there is a cat in a photo?
- For what reasons can I know that there is a cat in a photo?

Experimental Design (i.e. planning)
Machine Learning, i.e. automatic data-driven predictive models
- Data? Acquisition? Keywords: data sampling/survey
- Model? Assessment? Keywords: training/testing sets, evaluation metrics (e.g. mean squared error, precision, recall)

Data Sampling
Avoid as many sampling biases as possible.
- Representative sample
- How many photos, categories, photos in each category, ...?
- (If time-series data, e.g. videos) Sample at which time points?
- Imbalanced classes? Selection bias?
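
One common way to keep a sample representative when classes are imbalanced is stratified sampling: draw the same fraction from every class. The sketch below is only an illustration under assumptions; the photo names, the cat/dog labels, and the function `stratified_sample` are made up and do not come from the slides.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(examples, fraction, seed=0):
    """Draw the same fraction from each class so class proportions are kept."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in examples:
        by_label[label].append((item, label))
    sample = []
    for group in by_label.values():
        n_keep = max(1, round(len(group) * fraction))  # keep at least one per class
        sample.extend(rng.sample(group, n_keep))
    return sample

# Hypothetical imbalanced dataset: 80 cat photos, 20 dog photos.
photos = [(f"photo_{i}", "cat" if i < 80 else "dog") for i in range(100)]
subset = stratified_sample(photos, fraction=0.25)
counts = Counter(label for _, label in subset)
print(counts)
```

Sampling each class separately keeps the 80/20 ratio in the subset, whereas a plain random draw of 25 photos could by chance contain very few dogs.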

Model Assessment: Evaluation metrics
Which metrics to use depends on which l…
- Accuracy
- Precision, Recall
- Area Under Curve (AUC)
- Mean squared error (MSE)
- (If hypothesis testing problem) t-statistic, z-statistic, $\chi^2$ statistic, ...
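
A minimal sketch of how some of the metrics listed above can be computed from true labels and predictions. The function names and the toy labels are assumptions for illustration, not part of the slides.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred, positive=True):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_squared_error(y_true, y_pred):
    """Average squared difference, for real-valued targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels (True = "cat") and predictions, for illustration only.
labels = [True, True, False, False, True, False]
predictions = [True, False, False, True, True, False]
print(accuracy(labels, predictions))
print(precision_recall(labels, predictions))
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
```

Accuracy alone can mislead on imbalanced classes, which is one reason precision and recall are reported alongside it.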

Model Assessment: Evaluation setup
- Evaluation (i.e. report results) on unseen data
- Training/testing set split: follows data sampling principles
- Repeat experiment: gives measurable confidence to the reported results
If the training/testing set split is well designed with sufficient examples, we might not need to repeat many experiments.
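
The evaluation setup can be sketched as a random training/testing split repeated several times, reporting the mean and spread of the score. This is an illustration under assumptions: the toy dataset, the trivial "predict the majority label" model, and all function names are invented here to keep the example short.

```python
import random
import statistics

def train_test_split(examples, test_fraction, rng):
    """Shuffle a copy of the data and hold out a fraction as unseen test data."""
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def majority_label(train):
    """Placeholder 'model': always predict the most frequent training label."""
    labels = [y for _, y in train]
    return max(set(labels), key=labels.count)

def repeated_evaluation(examples, n_repeats=20, test_fraction=0.3, seed=0):
    """Repeat split/train/test and return mean and standard deviation of accuracy."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = train_test_split(examples, test_fraction, rng)
        guess = majority_label(train)
        scores.append(sum(y == guess for _, y in test) / len(test))
    return statistics.mean(scores), statistics.stdev(scores)

# Hypothetical dataset: 100 items, 30 of them labelled True.
examples = [(i, i % 10 < 3) for i in range(100)]
mean_accuracy, std_accuracy = repeated_evaluation(examples)
print(f"accuracy = {mean_accuracy:.3f} +/- {std_accuracy:.3f}")
```

The standard deviation across repeats is one simple way to attach a measurable confidence to the reported result.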

Model Building
"All models are wrong, but some are useful." - Box and Draper, 1987
Model: a simplification of reality (e.g. map of Hanoi)
Keywords: Linear models, Graphical models, Neural networks, SVM, Gaussian Process, Random forest
Modelling tip: build models from the most simplified forms to the more complex, to describe reality more precisely (e.g. from Linear models to Latent variable models / Deep neural networks)

Data pre-process
- Data ETL: extract, transform, load
- Data standardisation / normalisation
- Data imputation (if missing values)
- Feature extraction
[Figure: matrix of numeric data values, with rows (-0.34, -0.46, -0.87), (1.47, -0.24, 2.21), (-1.05, 0.02, -1.74), (0.09, -0.58, 1.02), (1.63, -0.53, 0.06), (1.11, -0.63, -0.93), ...]
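
Two of the pre-processing steps named above, imputation and standardisation, can be sketched for a single numeric column as follows. Mean imputation and z-score standardisation are just one common choice each, assumed here for illustration; the column of house areas is made up.

```python
import statistics

def impute_mean(column):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in column]

def standardise(column):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(v - mean) / std for v in column]

# Hypothetical column of house areas with one missing entry.
areas = [100.0, None, 80.0, 120.0]
filled = impute_mean(areas)
scaled = standardise(filled)
print(filled)
print(scaled)
```

In practice the mean and standard deviation should be computed on the training set only and then reused on the testing set, so that no information from unseen data leaks into the model.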

[Pipeline diagram: Question/Hypothesis, Experimental Design, Data sampling, Data pre-process, Modelling (what ML is mostly about), Assessment, Interpretation]

[Pipeline diagram, closing the loop: Interpretation (explain/analyse the results) leads to a NEW question/hypothesis, followed again by Experimental Design, Data sampling, Data pre-process, Modelling (what ML is mostly about), Assessment]

PRINCIPLES OF MODELLING
Statistical reasoning (*)
(*) A machine learning algorithm does not necessarily have a probabilistic interpretation, nor is it necessarily developed from a statistical framework. Nevertheless, statistical reasoning provides a rigorous mathematical tool for estimation and inference to make optimal decisions (e.g. prediction, action) under uncertainty, which is one of the ultimate objectives in ML.

[Pipeline diagram: Question, Experimental Design, Data acquisition, Data pre-process, Modelling, Assessment]

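ML problem: Classification (supervised learning, classification problem)
Question: Is there any cat in an arbitrary photo?
Experience: dataset of {image, label} pairs $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$
Modelling: predict $\hat{y}_n$ - cat existence - given arbitrary $x_n$
- Image: $x_n \in \mathbb{N}^{400 \times 600 \times 3}$
- Prediction (Cat? Not cat?): $\hat{y}_n \in \{\text{True}, \text{False}\}$
Assessment:
- Accuracy $= \frac{1}{N} \sum_{n=1}^{N} \mathbb{I}(\hat{y}_n = y_n)$
- Precision, Recall, F1-score
- Area Under Curve (AUC)
Example models:
- Logistic regression (linear model)
- Neural Net with sigmoid output (nonlinear model)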
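
As a rough illustration of the first example model, here is logistic regression trained by plain gradient descent. To stay small it uses a single made-up scalar feature in place of a 400 x 600 x 3 image; the data, learning rate, and function names are all assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_logistic_regression(xs, ys, learning_rate=0.5, n_steps=2000):
    """Minimise the mean cross-entropy loss of sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(w, b, x):
    """Predict True ("cat") when the estimated probability is at least 0.5."""
    return sigmoid(w * x + b) >= 0.5

# Hypothetical 1-D feature per photo; label 1 = cat, 0 = not cat.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(xs, ys)
accuracy = sum(predict(w, b, x) == bool(y) for x, y in zip(xs, ys)) / len(xs)
print(w, b, accuracy)
```

The accuracy computed at the end is the same quantity as the Accuracy formula in the Assessment list, evaluated here on the training data only for brevity.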

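ML problem: Classification (supervised learning, classification problem)
Question: What is there in an arbitrary photo?
Experience: dataset of {image, label} pairs $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$
Modelling: predict $\hat{y}_n$ - object identity - given arbitrary $x_n$
- Image: $x_n \in \mathbb{N}^{400 \times 600 \times 3}$
- Prediction: $\hat{y}_n \in \{1, 2, 3, 4, 5, 6\}$
Assessment:
- Accuracy $= \frac{1}{N} \sum_{n=1}^{N} \mathbb{I}(\hat{y}_n = y_n)$
- Precision, Recall, F1-score
- Area Under Curve (AUC)
Example models:
- Softmax classification (linear model)
- Neural Net with softmax output (nonlinear model)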
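
Both example models end in a softmax, which turns one score per class into a probability distribution; the predicted class is the most probable one. A minimal sketch follows, where the six scores are invented numbers standing in for a model's output.

```python
import math

def softmax(scores):
    """Map real-valued class scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(scores):
    """Return the most probable class, numbered from 1 as on the slide."""
    probs = softmax(scores)
    return 1 + max(range(len(probs)), key=probs.__getitem__)

# Hypothetical scores for the six classes of one photo.
scores = [0.1, 2.0, -1.0, 0.5, 0.0, 1.2]
probs = softmax(scores)
print(probs)
print(predict_class(scores))
```

With two classes, softmax reduces to the sigmoid used in the binary case.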

ML problem: Regression (supervised learning, regression problem)
Question: How much is the price of a house given ...?
Experience: dataset of {(area, location, #rooms), price} pairs $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$
Modelling: predict $\hat{y}_n$ - house price - given arbitrary $x_n$
- Example input: Area 100 m2, Location 24.70N 183.00E, #Rooms 3
- $x_n \in \mathbb{R} \times \mathbb{R}^2 \times \mathbb{N}$
Assessment:
- Mean squared error $= \frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2$
Example models/algorithms:
- Linear regression (linear model)
- Neural Net with linear output (nonlinear model)
- Curve fitting algorithm
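
A minimal sketch of linear regression with one input feature, fitted by ordinary least squares in closed form and assessed with the mean squared error. Using area as the only feature, and the prices themselves, are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical data generated from price = 2 * area + 10 (arbitrary units).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [110.0, 170.0, 210.0, 250.0]
slope, intercept = fit_line(areas, prices)
predicted = [slope * a + intercept for a in areas]
print(slope, intercept, mean_squared_error(prices, predicted))
```

With several features (area, location, #rooms) the same least-squares idea applies, with a weight per feature instead of a single slope.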

ML problem: Clustering (unsupervised learning)
Question: What is the "topic" that a news article is talking about?
Experience: dataset of article content only $\mathcal{D} = \{x_n\}_{n=1}^{N}$
Modelling: predict $z_n$ - "topic" (cluster) identity - given arbitrary $x_n$
- Article: $x_n$
- Prediction: $z_n \in \{1, 2, \ldots, 10\}$
Assessment:
- mean distance to clusters $= \frac{1}{N} \sum_{n=1}^{N} \lVert x_n - \mu_{z_n} \rVert$
Note: "topic" means group/cluster in this context, and is not pre-defined. We will meet the term "topic" again when visiting Topic models.
Example models/algorithms:
- k-means algorithm
- Generative models: Mixture models, Topic models
[Figure: scatter plot in which a point $x_n$ is assigned to the cluster $z_n$ = green]
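
The k-means algorithm alternates between assigning each point to its nearest cluster centre and moving each centre to the mean of its points. Below is a minimal sketch on made-up 2-D points; real articles would first need to be turned into numeric vectors, and the hand-picked initial centres are an assumption for reproducibility.

```python
import math

def nearest_centre(point, centres):
    """Index of the centre closest to the point (Euclidean distance)."""
    return min(range(len(centres)), key=lambda k: math.dist(point, centres[k]))

def k_means(points, centres, n_iterations=10):
    """Alternate assignment and centre-update steps for a fixed number of rounds."""
    centres = [tuple(c) for c in centres]
    for _ in range(n_iterations):
        assignments = [nearest_centre(p, centres) for p in points]
        updated = []
        for k, old in enumerate(centres):
            members = [p for p, z in zip(points, assignments) if z == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centre unchanged
        centres = updated
    assignments = [nearest_centre(p, centres) for p in points]
    return centres, assignments

def mean_distance_to_centres(points, centres, assignments):
    """Assessment: average distance from each point to its own cluster centre."""
    return sum(math.dist(p, centres[z]) for p, z in zip(points, assignments)) / len(points)

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centres, assignments = k_means(points, centres=[(0.0, 0.0), (10.0, 10.0)])
print(assignments, mean_distance_to_centres(points, centres, assignments))
```

No labels are used anywhere: the cluster indices come only from the geometry of the data, which is what makes the problem unsupervised.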

An ML problem can also be:
- both supervised and unsupervised (semi-supervised)
- a combination of regression and classification subproblems, e.g. image localisation

PRINCIPLES OF MODELLING
1. Model structure - constructs relationships (stochastic and/or deterministic) between model elements: data, parameters, and hyper-parameters.
Keywords: graphical model
2. Learning principle - defines a framework to estimate unknown parameters (and unobserved, i.e. hidden/latent, variables)
Keywords: Maximum Likelihood criterion, Bayesian inference, others
3. Regularisation
Keywords: over-fitting, Bayesian inference, others
Relevant keywords: L2-regularisation (Ridge), L1-regularisation (LASSO)
ALGORITHM - implements 1 + 2 + 3 to train the model
Keywords: (stochastic) gradient descent, Expectation-Maximisation (EM), Variational Inference (VI), sampling-based inference methods
4. Model selection
Keywords: cross-validation
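
To make the four items concrete, here is one possible instance, sketched under assumptions: the model structure is y = w * x, the learning principle is least squares, the regulariser is L2 (Ridge), the algorithm is gradient descent, and model selection uses k-fold cross-validation to pick the regularisation strength. The data, candidate strengths, and function names are invented for illustration.

```python
def train_ridge(xs, ys, l2_strength, learning_rate=0.05, n_steps=500):
    """Gradient descent on mean((w*x - y)^2) + l2_strength * w^2."""
    w = 0.0
    n = len(xs)
    for _ in range(n_steps):
        grad = 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / n
        grad += 2.0 * l2_strength * w
        w -= learning_rate * grad
    return w

def k_fold_mse(xs, ys, l2_strength, n_folds=3):
    """Average held-out mean squared error over n_folds interleaved folds."""
    fold_errors = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        w = train_ridge(train_x, train_y, l2_strength)
        error = sum((w * xs[i] - ys[i]) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(error)
    return sum(fold_errors) / n_folds

def select_l2_strength(xs, ys, candidates):
    """Model selection: keep the candidate with the lowest cross-validated error."""
    return min(candidates, key=lambda lam: k_fold_mse(xs, ys, lam))

# Hypothetical data, roughly y = 3 * x with small deviations.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [-6.1, -2.9, -1.6, 1.4, 3.2, 5.9]
best = select_l2_strength(xs, ys, candidates=[0.0, 0.1, 10.0])
print(best, train_ridge(xs, ys, best))
```

A large L2 strength shrinks the weight towards zero; cross-validation rejects it here because the shrunken model predicts the held-out points poorly.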

Before we get going
