Practical Machine Learning in R
Introduction
Lars Kotthoff, larsko@uwyo.edu
with slides from Bernd Bischl and Michel Lang
slides available at http://www.cs.uwyo.edu/~larsko/ml-fac
What is Machine Learning?
- “gives computers the ability to learn without being explicitly programmed” (Wikipedia)
- “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” (Tom Mitchell)
Examples
[Figure; source URL truncated: …predicting-machine-learning-tutorial/]
[Figure; source URL truncated: …ing-nba-divisions-by-clustering/]
Supervised Learning
- learn the relationship between input $x$ and output $y$
- training data with labels available: $y$ known for given $x$
- can see this as function approximation: find an $f$ such that $y \approx f(x)$
Supervised Learning
- $x$ are features or attributes
- $y$ is the ground truth
- denote predictions $f(x) = \hat{y}$
- loss function $L(y, \hat{y})$ measures how good predictions are, e.g. $L(y, \hat{y}) = (y - \hat{y})^2$
- want to minimize loss given training data $X_{\text{train}} = \{(x_i, y_i)\}_{i=1}^{n}$: $\arg\min \sum_{i=1}^{n} L(y_i, \hat{y}_i)$
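As a minimal sketch of the squared loss above, the following base R code fits a linear model to synthetic data and sums the loss over the training data. The data-generating function, the variable names, and the choice of `lm` as the family of candidate functions are assumptions made for illustration; they are not taken from the slides.

```r
# Synthetic training data (assumption: a noisy linear relationship)
set.seed(1)
n <- 50
x <- runif(n, min = 0, max = 10)
y <- 2 * x + 1 + rnorm(n, sd = 0.5)

# Squared loss L(y, y_hat) = (y - y_hat)^2, applied element-wise
squared_loss <- function(y, y_hat) (y - y_hat)^2

# lm() picks the linear function f that minimizes the summed squared loss
fit <- lm(y ~ x)
y_hat <- predict(fit)  # predictions on the training data

# Total training loss of the fitted f
train_loss <- sum(squared_loss(y, y_hat))

# For comparison: always predicting the mean of y gives a larger loss
baseline_loss <- sum(squared_loss(y, mean(y)))

print(c(fitted = train_loss, baseline = baseline_loss))
```

Comparing against a constant predictor is one simple way to check that the fitted function has actually reduced the loss.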
Supervised Learning
- want to learn a general function that is predictive on new data
- second set $X_{\text{test}}$ that is not used in training to test generalization performance: $\sum_{i=1}^{n} L(y_i, \hat{y}_i)$
- usually full data set $X$ is split into non-overlapping train and test sets: $X_{\text{train}} \cup X_{\text{test}} = X$, $X_{\text{train}} \cap X_{\text{test}} = \emptyset$
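The split into train and test sets can be sketched in base R as follows. The 70/30 ratio, the synthetic data, and the use of `lm` are assumptions chosen for illustration only.

```r
# Synthetic data set X (assumption: noisy linear relationship)
set.seed(42)
n <- 100
data <- data.frame(x = runif(n, 0, 10))
data$y <- 3 * data$x - 2 + rnorm(n, sd = 1)

# Randomly assign 70% of the rows to training; the rest form the test set
train_idx <- sample(seq_len(n), size = round(0.7 * n))
test_idx <- setdiff(seq_len(n), train_idx)

# The two index sets are disjoint and together cover the full data set
stopifnot(length(intersect(train_idx, test_idx)) == 0,
          setequal(union(train_idx, test_idx), seq_len(n)))

# Fit on the training set only
fit <- lm(y ~ x, data = data[train_idx, ])

# Summed squared loss on the held-out test set
y_hat_test <- predict(fit, newdata = data[test_idx, ])
test_loss <- sum((data$y[test_idx] - y_hat_test)^2)
print(test_loss)
```

Because the test rows play no part in fitting, `test_loss` estimates how well the learned function generalizes to new data.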
Supervised Classification
[Figure: plot of feature b against feature a, points marked by class (car, truck)]
Goal: Predict a class (discrete quantity), or membership probabilities
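To illustrate the difference between predicting a class and predicting membership probabilities, here is a small base R sketch. Logistic regression via `glm` and the two-class subset of the built-in `iris` data are assumptions for illustration; the slides do not prescribe a particular method or data set.

```r
# Two-class subset of the built-in iris data (illustrative choice)
d <- droplevels(subset(iris, Species != "setosa"))

# Logistic regression; glm models the probability of the second
# factor level, which is "virginica" here
fit <- glm(Species ~ Petal.Width, data = d, family = binomial)

# Membership probability for class "virginica"
p_virginica <- predict(fit, type = "response")

# Discrete class prediction by thresholding the probability at 0.5
pred_class <- ifelse(p_virginica > 0.5, "virginica", "versicolor")

# Fraction of misclassified examples (on the training data)
error_rate <- mean(pred_class != d$Species)
print(error_rate)
```

The error rate here is computed on the training data, so it says little about generalization; a held-out test set would be needed for that.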
Supervised Regression
[Figure: plot of b against a]
Goal: Predict a continuous quantity
Unsupervised Learning
- no ground truth $y$ available
- determine group membership or assign labels
- loss function measures properties of groups, e.g. homogeneity with respect to features
- still want to minimize loss given training data and generalize
Unsupervised Clustering
[Figure: plot of feature b against feature a]
Goal: Group data by similarity, or estimate membership probabilities
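Grouping by similarity can be sketched with k-means from base R. The choice of k-means, the number of clusters, and the synthetic data are assumptions for illustration, not the method used in the slides.

```r
set.seed(7)

# Synthetic data: two groups in features a and b (assumption)
group1 <- data.frame(a = rnorm(30, mean = 2, sd = 0.3),
                     b = rnorm(30, mean = 0.5, sd = 0.2))
group2 <- data.frame(a = rnorm(30, mean = 5, sd = 0.3),
                     b = rnorm(30, mean = 2, sd = 0.2))
pts <- rbind(group1, group2)

# k-means with k = 2; nstart tries several random initializations
km <- kmeans(pts, centers = 2, nstart = 10)

# Cluster assignment per point; no ground truth labels were used
print(table(km$cluster))

# Loss being minimized: total within-cluster sum of squares,
# a measure of how homogeneous the groups are
print(km$tot.withinss)
```

k-means assigns each point to exactly one cluster; estimating membership probabilities would need a different method, such as a mixture model.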
In this Course
- classification
- regression
- clustering
- data preprocessing (missing values, dimensionality reduction)
- performance evaluation
- parameter tuning
Not in this Course
- R tutorial
- details on particular methods
- deep learning
- time series
- Big Data
What you’ll need
Install …wnload/
Install mlr
- on the R console: install.packages("mlr")
- or see …s/InstallPackagesRStudio.html
- extensive tutorial available
Format
- meetings roughly every week
- half lecture, half practical exercises
- happy to discuss specific problems