
Introduction to Pattern Recognition
Selim Aksoy
Department of Computer Engineering, Bilkent University
saksoy@cs.bilkent.edu.tr
CS 551, Fall 2019
(c) 2019, Selim Aksoy (Bilkent University)

Human Perception

- Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g.,
  - recognizing a face,
  - understanding spoken words,
  - reading handwriting,
  - distinguishing fresh food from its smell.
- We would like to give similar capabilities to machines.

What is Pattern Recognition?

- A pattern is an entity, vaguely defined, that could be given a name, e.g.,
  - fingerprint image,
  - handwritten word,
  - human face,
  - speech signal,
  - DNA sequence,
  - ...
- Pattern recognition is the study of how machines can
  - observe the environment,
  - learn to distinguish patterns of interest,
  - make sound and reasonable decisions about the categories of the patterns.

Human and Machine Perception

- We are often influenced by the knowledge of how patterns are modeled and recognized in nature when we develop pattern recognition algorithms.
- Research on machine perception also helps us gain deeper understanding and appreciation for pattern recognition systems in nature.
- Yet, we also apply many techniques that are purely numerical and do not have any correspondence in natural systems.

Pattern Recognition Applications

Figure 1: English handwriting recognition.

Figure 2: Chinese handwriting recognition.

Figure 3: Biometric recognition.

Figure 4: Fingerprint recognition.

Figure 5: Autonomous navigation.

Figure 6: Cancer detection and grading using microscopic tissue data. (left) A whole slide image with 75568 × 74896 pixels. (right) A region of interest with 7440 × 8260 pixels.

Figure 7: Land cover classification using satellite data.

Figure 8: Building and building group recognition using satellite data.

Figure 9: License plate recognition: US license plates.

Figure 10: Clustering of microarray data.

An Example

- Problem: Sorting incoming fish on a conveyor belt according to species.
- Assume that we have only two kinds of fish:
  - sea bass,
  - salmon.

Figure 11: Picture taken from a camera.

An Example: Decision Process

- What kind of information can distinguish one species from the other?
  - length, width, weight, number and shape of fins, tail shape, etc.
- What can cause problems during sensing?
  - lighting conditions, position of fish on the conveyor belt, camera noise, etc.
- What are the steps in the process?
  - capture image → isolate fish → take measurements → make decision
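The four steps above can be sketched as a tiny pipeline. Everything here is a hypothetical stand-in: the "captured image" is just two fabricated (length, lightness) readings, and the 40 cm threshold is invented for illustration.

```python
# Minimal sketch of the decision pipeline; all names and values are
# hypothetical stand-ins for real sensing and segmentation code.

def capture_image():
    # Pretend sensor output: a list of (length_cm, lightness) readings.
    return [(32.0, 4.1), (48.5, 7.8)]

def isolate_fish(image):
    # In practice: segmentation from the background. Here the readings
    # are already isolated, so this is an identity step.
    return image

def take_measurements(fish):
    # Extract the feature we will base the decision on: the length.
    return [length for length, _ in fish]

def make_decision(lengths, threshold=40.0):
    # Longer than the threshold -> guess sea bass, else salmon.
    return ["sea bass" if l > threshold else "salmon" for l in lengths]

decisions = make_decision(take_measurements(isolate_fish(capture_image())))
print(decisions)
```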

An Example: Selecting Features

- Assume a fisherman told us that a sea bass is generally longer than a salmon.
- We can use length as a feature and decide between sea bass and salmon according to a threshold on length.
- How can we choose this threshold?

An Example: Selecting Features

Figure 12: Histograms of the length feature for two types of fish in training samples. How can we choose the threshold l to make a reliable decision?
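One simple way to pick the threshold, sketched below with made-up training lengths (a real system would use the actual training histograms): try each observed length as a candidate threshold and keep the one with the fewest training misclassifications.

```python
# Hedged sketch of threshold selection by minimizing training error.
# The sample lengths below are invented for illustration.

salmon = [28, 30, 31, 33, 35, 36, 38, 40]
sea_bass = [34, 37, 39, 41, 43, 45, 47, 50]

def training_error(threshold):
    # Decision rule: length > threshold -> sea bass, else salmon.
    errors = sum(1 for l in salmon if l > threshold)      # salmon called bass
    errors += sum(1 for l in sea_bass if l <= threshold)  # bass called salmon
    return errors

# Every observed length is a candidate threshold.
candidates = sorted(set(salmon + sea_bass))
best = min(candidates, key=training_error)
print(best, training_error(best))
```

Because the two histograms overlap, even the best threshold leaves some training samples misclassified, which is exactly the point of the figure.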

An Example: Selecting Features

- Even though sea bass is longer than salmon on the average, there are many examples of fish where this observation does not hold.
- Try another feature: average lightness of the fish scales.

An Example: Selecting Features

Figure 13: Histograms of the lightness feature for two types of fish in training samples. It looks easier to choose the threshold x but we still cannot make a perfect decision.

An Example: Cost of Error

- We should also consider the costs of the different errors we make in our decisions.
- For example, suppose the fish packing company knows that:
  - Customers who buy salmon will object vigorously if they see sea bass in their cans.
  - Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans.
- How does this knowledge affect our decision?
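A hedged sketch of how asymmetric costs change the decision: instead of counting errors equally, weight each kind of mistake by a cost and pick the length threshold with the lowest total cost. The lengths and the 10:1 cost ratio below are invented for illustration.

```python
# Cost-sensitive threshold selection on made-up training lengths.
# Canning a sea bass as "salmon" angers customers, so it costs more.

salmon = [28, 30, 31, 33, 35, 36, 38, 40]
sea_bass = [34, 37, 39, 41, 43, 45, 47, 50]

COST_BASS_AS_SALMON = 10.0  # salmon buyers object vigorously
COST_SALMON_AS_BASS = 1.0   # bass buyers do not mind expensive salmon

def expected_cost(threshold):
    # Decision rule: length > threshold -> sea bass, else salmon.
    cost = COST_SALMON_AS_BASS * sum(1 for l in salmon if l > threshold)
    cost += COST_BASS_AS_SALMON * sum(1 for l in sea_bass if l <= threshold)
    return cost

candidates = sorted(set(salmon + sea_bass))
best = min(candidates, key=expected_cost)
print(best, expected_cost(best))
```

With these costs the chosen threshold moves below the minimum sea bass length, so no sea bass is labeled salmon, at the price of more salmon labeled sea bass.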

An Example: Multiple Features

- Assume we also observed that sea bass are typically wider than salmon.
- We can use two features in our decision:
  - lightness: x1
  - width: x2
- Each fish image is now represented as a point (feature vector) x = [x1, x2]^T in a two-dimensional feature space.

An Example: Multiple Features

Figure 14: Scatter plot of lightness and width features for training samples. We can draw a decision boundary to divide the feature space into two regions. Does it look better than using only lightness?
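With two features, the decision rule becomes a boundary in the feature space. Below is a minimal sketch of a linear boundary w1*x1 + w2*x2 + b = 0; the feature vectors and the weights are invented for illustration, not fitted to real data.

```python
# Sketch of a linear decision boundary in the (lightness, width) space.
# All feature values and weights are hypothetical.

# Each fish: feature vector x = (x1, x2) = (lightness, width).
salmon = [(2.0, 3.5), (2.5, 3.0), (3.0, 3.8)]
sea_bass = [(6.0, 5.5), (6.5, 6.0), (7.0, 5.8)]

# Boundary: w1*x1 + w2*x2 + b = 0; positive side -> sea bass.
w1, w2, b = 1.0, 1.0, -9.0

def classify(x):
    x1, x2 = x
    return "sea bass" if w1 * x1 + w2 * x2 + b > 0 else "salmon"

labels = [classify(x) for x in salmon + sea_bass]
print(labels)
```

For these (cleanly separated) toy points a straight line classifies everything correctly; real training samples overlap, as Figure 14 shows.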

An Example: Multiple Features

- Does adding more features always improve the results?
  - Avoid unreliable features.
  - Be careful about correlations with existing features.
  - Be careful about measurement costs.
  - Be careful about noise in the measurements.
- Is there some curse for working in very high dimensions?

An Example: Decision Boundaries

- Can we do better with another decision rule?
- More complex models result in more complex boundaries.

Figure 15: We may distinguish training samples perfectly, but how can we predict how well we can generalize to unknown samples?

An Example: Decision Boundaries

- How can we manage the tradeoff between the complexity of decision rules and their performance on unknown samples?

Figure 16: Different criteria lead to different decision boundaries.

More on Complexity

Figure 17: Regression example: plot of 10 sample points for the input variable x along with the corresponding target variable t. The green curve is the true function that generated the data.

More on Complexity

Figure 18: Polynomial curve fitting: plots of polynomials of various orders ((a) 0th, (b) 1st, (c) 3rd, (d) 9th), shown as red curves, fitted to the set of 10 sample points.

More on Complexity

Figure 19: Polynomial curve fitting: plots of 9th order polynomials fitted to (a) 15 and (b) 100 sample points.
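The overfitting effect in the figures can be reproduced in a few lines: a polynomial with as many coefficients as data points passes through every training sample exactly (zero training error), yet can swing wildly between them. The data below is a made-up noisy sample, and the interpolation uses the standard Lagrange formula.

```python
# Sketch of overfitting: exact interpolation of noisy sample points.
# The data is fabricated for illustration.

def lagrange(xs, ys, x):
    # Evaluate the unique interpolating polynomial through (xs, ys) at x.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0.1, 0.9, 0.6, -0.5, -1.0, 0.2]  # noisy "target" values

# Zero error at the training points...
train_err = max(abs(lagrange(xs, ys, x) - y) for x, y in zip(xs, ys))

# ...but the curve can oscillate between the points, so low training
# error says little about behavior on new inputs.
print(train_err, lagrange(xs, ys, 0.1))
```

As Figure 19 suggests, the cure is more data (or a simpler model): with many more samples than coefficients, the fit can no longer chase every noisy point.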

Pattern Recognition Systems

Figure 20: Object/process diagram of a pattern recognition system. The recognition path runs from the physical environment through data acquisition/sensing, pre-processing, feature extraction, and classification to post-processing and the decision; a parallel training path applies pre-processing and feature extraction to training data for model learning/estimation, producing the model used by the classifier.

Pattern Recognition Systems

- Data acquisition and sensing:
  - Measurements of physical variables.
  - Important issues: bandwidth, resolution, sensitivity, distortion, SNR, latency, etc.
- Pre-processing:
  - Removal of noise in data.
  - Isolation of patterns of interest from the background.
- Feature extraction:
  - Finding a new representation in terms of features.

Pattern Recognition Systems

- Model learning and estimation:
  - Learning a mapping between features and pattern groups and categories.
- Classification:
  - Using features and learned models to assign a pattern to a category.
- Post-processing:
  - Evaluation of confidence in decisions.
  - Exploitation of context to improve performance.
  - Combination of experts.

The Design Cycle

Figure 21: The design cycle: collect data → select features → select model → train classifier → evaluate classifier.

- Data collection:
  - Collecting training and testing data.
  - How can we know when we have an adequately large and representative set of samples?

The Design Cycle

- Feature selection:
  - Domain dependence and prior information.
  - Computational cost and feasibility.
  - Discriminative features:
    - Similar values for similar patterns.
    - Different values for different patterns.
  - Invariant features with respect to translation, rotation, and scale.
  - Robust features with respect to occlusion, distortion, deformation, and variations in environment.

The Design Cycle

- Model selection:
  - Domain dependence and prior information.
  - Definition of design criteria.
  - Parametric vs. non-parametric models.
  - Handling of missing features.
  - Computational complexity.
  - Types of models: templates, decision-theoretic or statistical, syntactic or structural, neural, and hybrid.
  - How can we know how close we are to the true model underlying the patterns?

The Design Cycle

- Training:
  - How can we learn the rule from data?
  - Supervised learning: a teacher provides a category label or cost for each pattern in the training set.
  - Unsupervised learning: the system forms clusters or natural groupings of the input patterns.
  - Reinforcement learning: no desired category is given, but the teacher provides feedback to the system, such as whether the decision is right or wrong.

The Design Cycle

- Evaluation:
  - How can we estimate the performance with training samples?
  - How can we predict the performance with future data?
  - Problems of overfitting and generalization.
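A minimal sketch of the evaluation idea: hold out part of the labeled data, fit a simple threshold rule on the rest, and estimate performance on the held-out samples. All lengths and labels below are invented for illustration.

```python
# Held-out evaluation of a length-threshold classifier on made-up data.

data = [(30, "salmon"), (33, "salmon"), (36, "salmon"), (38, "salmon"),
        (37, "sea bass"), (41, "sea bass"), (44, "sea bass"), (48, "sea bass")]

train, test = data[:6], data[6:]  # fixed split; in practice, shuffle first

def fit_threshold(samples):
    # Pick the threshold with the fewest mistakes on the given samples.
    def errors(t):
        return sum((l > t) != (y == "sea bass") for l, y in samples)
    return min((l for l, _ in samples), key=errors)

t = fit_threshold(train)
# Accuracy on the held-out samples estimates performance on future data.
accuracy = sum((l > t) == (y == "sea bass") for l, y in test) / len(test)
print(t, accuracy)
```

Training accuracy alone would be optimistic: the threshold was chosen to fit the training samples, so only held-out data reveals how well the rule generalizes.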

Summary

- Pattern recognition techniques find applications in many areas: machine learning, statistics, mathematics, computer science, biology, etc.
- There are many sub-problems in the design process.
- Many of these problems can indeed be solved.
- More complex learning, searching, and optimization algorithms are developed with advances in computer technology.
- There remain many fascinating unsolved problems.
