Introduction To Time Series Forecasting - IIT Bombay

Transcription

Introduction to Time Series Forecasting
With Case Studies in NLP
A Tutorial at ICON 2019
Sandhya Singh & Kevin Patel
CFILT, IIT Bombay
December 18, 2019

Overview
We will highlight how NLP people are also well suited to work on Time Series problems.
We will provide background information on Time Series Forecasting.
We will discuss some statistical approaches, some classical machine learning approaches, and some deep learning approaches for time series forecasting.
We will mention some commonalities between NLP and Time Series, and how one can assist the other.

Outline
1 Introduction
  Time Series
  Time Series Forecasting
2 Background
  Time Series Components
  Time Series Categorization
  Time Series Forecasting Terminology
3 Statistical Methods
  Simple Models
  Auto Regressive Models
  Evaluation Metrics
4 Classical ML Models
  Preparing Data
  ML Models

Outline
5 Deep Learning Models
6 Connection with NLP
  Problem Level
  Tooling Level
7 Demos
  Statsmodel Library
  Prophet Library
8 Conclusion

Acknowledgement
We thank
  The ICON 2019 committee, for accepting our proposal
  LGSoft, for their joint project with CFILT; our investigations into the same germinated the idea of this tutorial

Some Context Regarding the Tutorial
Why are we (people working in NLP) talking about Time Series?
Text and Time Series - both are Sequential Data
Commonality - exploit structure we know about the problem in advance
  The sequential nature, in this case
Similar tools, different terminology
Knowing the terminology will enable us to apply our knowledge of tools in this area too

Target Audience
People who have not worked with Time Series data
Subsumes
  Those who are completely new to ML
  Those who have basic knowledge of how to apply ML
  Those who are proficient in ML and/or are working in Natural Language Processing

Introduction

Introduction: Time Series

What is a Time Series?
Definition: A time series is a sequence of observations ordered in time.
  {X_t : t = 0, 1, 2, 3, ...}
The observations are collected over a fixed time interval
The time dimension adds structure and constraint to the data
[Figure: Nifty 50 Index as of 13/10/2019 15:31 IST]

Where (and When) does One Encounter Time Series?
As a time series of our life!
Economy and Finance: Exchange rates, Interest rates, Employment rate, Financial indices
Meteorology: Properties of weather like temperature, humidity, windspeed, etc.
Medicine: Physiological signals (EEG), heart-rate, patient temperature, etc.
Other venues:
  Industry: Electric load, power consumption, resource consumption
  Web: Clicks, Logs

What is Time Series Analysis?
Applying statistical approaches to time series data
Will enable one to
  Predict the future based on the past
  Understand the underlying mechanism which generates the data
  Control the mechanism
  Describe salient features of the data

Introduction: Time Series Forecasting

Time Series Forecasting
Extrapolation in classical statistical terminology
Predict the future based on the past
Example (monthly airline passengers, in thousands): given the observed values 118, 132, 129, 121, 135 for the preceding months, forecast the value for the next month (shown as '?' on the slide)

Forecasting - Yes or No?
Determining whether tomorrow a stock will go up/down or stay put?
Given a voice recording, who is the speaker?
Given a voice recording, who is speaking after the current speaker?
Given an ECG plot, is the heart functioning normally or abnormally?
Given an ECG plot, predict whether the person will have a heart related issue in the next month.

Forecasting - Yes or No?
Determining whether tomorrow a stock will go up/down or stay put? - YES
Given a voice recording, who is the speaker? - NO
Given a voice recording, who is speaking after the current speaker? - NO
Given an ECG plot, is the heart functioning normally or abnormally? - NO
Given an ECG plot, predict whether the person will have a heart related issue in the next month. - YES

Advantages of Time Series Forecasting
Reliability:
  Given the forecast of power surges in your area, you can check whether your home's wiring is reliable or not.
Preparing for Seasons:
  Looking at the patterns from previous Christmas events, stock your warehouse for the upcoming Christmas accordingly
  Given that the south east coast of India experiences typhoons during monsoons, pre-allocate rescue and relief resources
Estimating trends:
  Given the trend of a particular stock, should I invest in it?

Time Series Forecasting and Machine Learning
Forecasting - predicting the future from the past
Given an observed series Y, predict Y_{t+1} using Y_1, ..., Y_t
In other words, learn f such that
  Y_{t+1} = f(Y_1, ..., Y_t)    (1)
Machine Learning practitioners should easily be able to relate this expression to
  Y = f(X)    (2)
Are ML skills applicable? - Yes

AI/ML: NOT a Silver Bullet
AI/ML are multipliers, not a silver bullet.
Consider an example:
  EMNLP - Empirical Methods in Natural Language Processing - a top tier NLP conference
  EMNLP 2015 was informally called Embedding Methods in Natural Language Processing
  This was due to the sheer number of papers about word embeddings
  More or less implying that word embeddings are the silver bullet
  If that were the case, shouldn't all problems be solved by now? Shouldn't ACL, etc. close shop?
  That is not the case
Domain knowledge is still needed for proper utilization of ML
So let's discuss some background to gain time series domain knowledge

Background

Background: Time Series Components

Time Series Components
Level
  The average value of a time series
Trend
  A long term pattern present in the time series
  Can be positive, negative, linear or nonlinear
  If there is no increasing or decreasing trend, then the time series is stationary, i.e. the data has constant mean and variance over time

Time Series Components (contd.)
Seasonality
  Regular and predictable changes that recur in regular short intervals
  Largely due to the involvement of periodically occurring factors
Cyclicality
  Changes that recur in irregular intervals
  As opposed to the fixed period intervals in seasonality
Noise / Irregularity / Residual
  Random variations that do not repeat in the pattern

Time Series Components for Airline Passenger Data
[Figure: the airline passenger series decomposed into its components]
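
A minimal sketch of producing such a decomposition with statsmodels; the file and column names ("airline_passengers.csv", "Month", "Passengers") are illustrative assumptions about how the data is stored:

import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Load the monthly passenger counts into a pandas Series indexed by month
df = pd.read_csv("airline_passengers.csv", parse_dates=["Month"], index_col="Month")
series = df["Passengers"]

# period=12 for monthly data; a multiplicative model suits the passenger series
result = seasonal_decompose(series, model="multiplicative", period=12)
result.plot()            # panels: observed, trend, seasonal, residual
plt.show()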

Background: Time Series Categorization

Categorization of Time Series Problem Formulation
Based on the number of inputs
  Univariate vs. Multivariate
Based on the number of time steps predicted in the output
  One step forecasting vs. Multi step forecasting
Based on the modeling of interactions between different components
  Additive vs. Multiplicative models

Univariate Time Series
Single time dependent variable
Example: Monthly Airline Passenger Data

  Month    | No. of Passengers (in thousands)
  1949-02  | 118
  1949-03  | 132
  1949-04  | 129
  1949-05  | 121
  1949-06  | 135

Multivariate Time Series
Multiple time dependent variables
Can be considered as multiple univariate time series that need to be analyzed jointly
Example: Rainfall data [table values not legible in the transcription]

Categorization of Time Series Problem Formulation (recap)
Based on the number of inputs: Univariate vs. Multivariate
Based on the number of time steps predicted in the output: One step forecasting vs. Multi step forecasting
Based on the modeling of interactions between different components: Additive vs. Multiplicative models

One Step vs. Multi Step Forecasting
One Step Forecasting
  Given data up to time t, predict the value only for the next one step, i.e. at t+1
Multi Step Forecasting
  Given data up to time t, predict values for two or more steps, i.e. at t+1, t+2, t+3, ...

One Step vs. Multi Step Forecasting (contd.)
[Figures: One Step Prediction for 3 steps; Multi Step Prediction for 3 steps]

One Step vs. Multi Step Forecasting (contd.)
[Figures: One Step Prediction for 8 steps; Multi Step Prediction for 8 steps]
Note how close the prediction is to the true value in the case of one step prediction

One Step vs. Multi Step Forecasting (contd.)
This guy should NOT use multi step forecasting
Img Src: https://xkcd.com/605/

Categorization of Time Series Problem Formulation (recap)
Based on the number of inputs: Univariate vs. Multivariate
Based on the number of time steps predicted in the output: One step forecasting vs. Multi step forecasting
Based on the modeling of interactions between different components: Additive vs. Multiplicative models

Additive vs. Multiplicative Models
Additive models:
  The series is additively dependent on the different components
  Y = Level + Trend + Seasonality + Noise    (3)
Multiplicative models:
  The series is multiplicatively dependent on the different components
  Y = Level * Trend * Seasonality * Noise    (4)

Additive vs. Multiplicative Models (contd.)
[Figure: Comparison of Additive and Multiplicative Seasonality]
Img Src (truncated in transcription): ...tive-and-multiplicative-seasonality/

Additive or Multiplicative?
A sequence of quiz slides, each showing one time series plot and asking whether the series is additive or multiplicative before revealing the answer:
  Plot 1: Additive
  Plot 2: Multiplicative
  Plot 3: Multiplicative
  Plot 4: Additive
  Plot 5: Multiplicative
  Plot 6: Additive

Dealing with Multiplicative Models
[Figures: Passenger data is multiplicative; Log(Passenger) data is additive]
Taking the logarithm converts a multiplicative series into an additive one, since log(Y) = log(Level) + log(Trend) + log(Seasonality) + log(Noise)
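
A tiny sketch of that transform, continuing with the passenger Series loaded in the earlier decomposition sketch:

import numpy as np

# series: the passenger Series from the earlier sketch (illustrative)
log_series = np.log(series)
# The multiplicative components of the original series become additive in
# log space, so an additive decomposition of log_series is now appropriate.
print(log_series.head())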

Background: Time Series Forecasting Terminology

Correlations
Captures the relation between two series
  r = Corr(X, Y) = Cov(X, Y) / (σ_X σ_Y) = E[(X - μ_X)(Y - μ_Y)] / (σ_X σ_Y)
Img Src: https://en.wikipedia.org/wiki/Correlation_and_dependence
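
For instance, the Pearson correlation of two short illustrative arrays can be computed directly with numpy:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
r = np.corrcoef(x, y)[0, 1]    # Pearson correlation coefficient
print(round(r, 3))             # close to 1: strong positive linear relation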

Spurious Correlations
[Figure: examples of spurious correlations]

Autocorrelation
Capturing the relation between a series and a lagged version of the same
[Figures: Passenger data; Autocorrelation on Passenger data]
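
A small sketch of computing and plotting autocorrelation, again assuming the passenger Series from the earlier sketch:

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

print(series.autocorr(lag=12))   # correlation with the series lagged by 12 months
plot_acf(series, lags=40)        # autocorrelation for lags 0..40
plt.show()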

White Noise
White noise is a time series that is purely random in nature
Let's denote it by ε_t
The mean of white noise is zero, i.e. E[ε_t] = 0, and the variance is always constant
ε_t, ε_k are uncorrelated
If the data is white noise, then intelligent forecasting is not possible
The best one can do is to just return the mean as the prediction
https://en.wikipedia.org/wiki/White_noise

Stationarity
A time series is stationary if it does not exhibit any trend or seasonality
[Figures: Stationary Time Series; Non-Stationary Time Series]

Stationarity (contd.)
Strict stationarity
  P(Y_t) = P(Y_{t+k}), and P(Y_t, Y_{t+k}) is independent of t
  Mean and variance are time invariant
Weak stationarity
  In this case, mean is constant and variance is constant
  Cov(Y_1, Y_{1+k}) = Cov(Y_2, Y_{2+k}) = Cov(Y_3, Y_{3+k}) = ... = γ_k
  i.e. the covariance depends only on the lag value k
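
In practice stationarity is often checked with the Augmented Dickey-Fuller test; a minimal sketch with statsmodels, on the same illustrative series:

from statsmodels.tsa.stattools import adfuller

# Null hypothesis of the ADF test: the series has a unit root (is non-stationary)
adf_stat, p_value, *_ = adfuller(series.dropna())
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
# A small p-value (e.g. < 0.05) is evidence that the series is stationary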

Statistical Methods

Statistical Methods: Simple Models

Naive Forecasting
A dumb forecasting approach
Predict Y_{t+1} = Y_t
i.e. forecast that the next value is going to be the same as the current value

Simple Moving Average (SMA)
The prediction is the mean of a rolling window over previous data
  Y_t = (1/n) Σ_{i=1..n} X_{t-i}
where n is the rolling window size
[Table: monthly passenger counts (in thousands) alongside their simple moving average; values garbled in the transcription]
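
A minimal pandas sketch of the naive forecast and a simple moving average; the window size of 12 is illustrative and series is the passenger Series from earlier:

import pandas as pd

naive_forecast = series.shift(1)               # naive: predict Y_{t+1} = Y_t
sma_12 = series.rolling(window=12).mean()      # mean of the last 12 observations

comparison = pd.DataFrame({"actual": series,
                           "naive": naive_forecast,
                           "sma_12": sma_12})
print(comparison.tail())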

Simple Moving Average (SMA) (contd.)
Shortcomings of SMA:
  Smaller windows lead to more noise, rather than signal
  Will lag by the window size
  Cannot predict extreme values (due to averaging)
  Captures trend, but poor at capturing other components; poor at forecasting

Exponential Weighted Moving Average (EWMA)
Gives exponentially high weights to nearby values and low weights to far off values while performing weighted averaging
  Y_0 = X_0
  Y_t = (1 - α) Y_{t-1} + α X_t
where α is a smoothing factor such that 0 < α < 1
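
A corresponding pandas sketch for EWMA; α = 0.3 is an illustrative choice:

alpha = 0.3                                        # smoothing factor, 0 < alpha < 1
ewma = series.ewm(alpha=alpha, adjust=False).mean()
# adjust=False gives the recursive form Y_t = (1 - alpha) * Y_{t-1} + alpha * X_t
print(ewma.tail())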

Comparison between SMA and EWMA
[Figure]
One can see that seasonality is better captured in EWMA as compared to SMA

Statistical Methods: Auto Regressive Models

Auto Regressive (AR) Models
If the series is not white noise, then forecasting can be modeled as
  Y_t = f(Y_1, ..., Y_{t-1}, e_t)    (5)
Practically not feasible to consider all time steps
Approximation time!
  Y_t = β_0 + β_1 Y_{t-1} + ε_t    (6)
Since we used 1 step, this is called the AR(1) model
Extending to AR(p), we get
  Y_t = β_0 + β_1 Y_{t-1} + β_2 Y_{t-2} + ... + β_p Y_{t-p} + ε_t    (7)
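
A minimal AR sketch with statsmodels' AutoReg; the lag order of 3 and the 12-point test split are illustrative, and series is the passenger Series from earlier:

from statsmodels.tsa.ar_model import AutoReg

train, test = series[:-12], series[-12:]       # hold out the last 12 points
ar_model = AutoReg(train, lags=3).fit()        # fit an AR(3) model
preds = ar_model.predict(start=len(train), end=len(train) + len(test) - 1)
print(preds[:5])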

Moving Average (MA) Models
Consider the modeling in AR
  Y_t = f(Y_1, ..., Y_{t-1}, e_t)    (8)
Prediction based on previous values
In MA models, we model upon the white noise observations
  Y_t = f(e_1, ..., e_{t-1}, e_t)    (9)
Using the previous analogy, an MA(q) model learns
  Y_t = γ_0 + ε_t + γ_1 ε_{t-1} + γ_2 ε_{t-2} + ... + γ_q ε_{t-q}    (10)

ARMA Models
ARMA models combine both AR and MA models
An ARMA(p,q) model models Y_t using p previous values and q previous noise components
  Y_t = β_0 + β_1 Y_{t-1} + β_2 Y_{t-2} + ... + β_p Y_{t-p}    (11)
        + ε_t + γ_1 ε_{t-1} + γ_2 ε_{t-2} + ... + γ_q ε_{t-q}    (12)

Differencing: Converting Non-stationary to Stationary
A time series which is non-stationary can be converted to a stationary time series by differencing
  Y'_t = Y_t - Y_{t-1}
If still not stationary, do second order differencing
  Y''_t = Y'_t - Y'_{t-1} = (Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2})
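
Differencing is a one-liner in pandas; a quick sketch that also re-checks stationarity with the ADF test, on the same illustrative series:

from statsmodels.tsa.stattools import adfuller

diff1 = series.diff()                 # Y'_t  = Y_t  - Y_{t-1}
diff2 = series.diff().diff()          # Y''_t = Y'_t - Y'_{t-1}
print(adfuller(diff1.dropna())[1])    # p-value after first-order differencing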

ARIMA Models
Stands for Auto Regressive Integrated Moving Average
In ARIMA, the AR and MA parts are the same as in ARMA
However, I indicates the amount of differencing done
If differencing is done once, it is called I(1)
Thus an ARIMA(p,d,q) model is a combination of AR(p) and MA(q) with I(d)
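
A minimal ARIMA sketch with statsmodels; the order (2, 1, 2) is purely illustrative, and train is the split from the AR sketch above:

from statsmodels.tsa.arima.model import ARIMA

arima = ARIMA(train, order=(2, 1, 2)).fit()   # AR(2), I(1), MA(2)
forecast = arima.forecast(steps=12)           # multi step forecast: next 12 points
print(forecast)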

How to Decide p, d, q?
Difficult for a human - one will have to look at various plots, run some tests, etc.
Another approach - Auto ARIMA
  Learns p, d, and q automatically
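
One common realization is the pmdarima library's auto_arima, which searches over p, d, q (and seasonal orders) automatically; a sketch assuming monthly data (seasonal period m=12) and the same train split:

import pmdarima as pm

auto_model = pm.auto_arima(train, seasonal=True, m=12,
                           stepwise=True, suppress_warnings=True)
print(auto_model.summary())                   # chosen (p, d, q)(P, D, Q, m)
print(auto_model.predict(n_periods=12))       # forecast the next 12 points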

Statistical Methods: Evaluation Metrics

Evaluation Metrics
Standard evaluation metrics for time series forecasting are:
  Mean Absolute Error (MAE)
  Mean Absolute Percentage Error (MAPE)
  Mean Squared Error (MSE)
  Root Mean Squared Error (RMSE)
  Normalized Root Mean Squared Error (NRMSE)

Mean Absolute Error (MAE)
  MAE = (1/n) Σ_{j=1..n} |y_j - ŷ_j|    (13)
Measures the average magnitude of the errors
If MAE = 0, then there is no error
Unable to properly alert when the forecast is very off for a few points

Mean Absolute Percentage Error (MAPE)
  MAPE = (100% / n) Σ_{j=1..n} |(y_j - ŷ_j) / y_j|    (14)
Percentage equivalent of MAE
Not defined for zero values

Mean Squared Error (MSE)
  MSE = (1/n) Σ_{j=1..n} (y_j - ŷ_j)^2    (15)
Measures the mean of the squared error
Forecast values which are very off are penalized more
Squared values make it more difficult to interpret the errors

Root Mean Squared Error (RMSE)
  RMSE = sqrt( (1/n) Σ_{j=1..n} (y_j - ŷ_j)^2 )    (16)
The value of the loss is of similar magnitude as that of the prediction
Thereby making it more interpretable
Also punishes large prediction errors

Normalized Root Mean Squared Error (NRMSE)
  NRMSE = sqrt( (1/n) Σ_{j=1..n} (y_j - ŷ_j)^2 ) / Z    (17)
where Z is the normalization factor
NRMSE allows for comparison between models across different datasets
Common normalization factors:
  Mean: preferred when the preprocessing and predicted feature are the same
  Range: sensitive to sample size
  Standard deviation: suitable across datasets as well as predicted features
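
All five metrics are easy to compute directly with numpy; a small self-contained sketch in which NRMSE is normalized by the mean:

import numpy as np

def forecast_metrics(y_true, y_pred):
    """Standard forecasting error metrics from the slides."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / y_true))   # undefined if any y_true == 0
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    nrmse = rmse / np.mean(y_true)                 # normalization factor Z = mean
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "NRMSE": nrmse}

print(forecast_metrics([118, 132, 129], [120, 130, 131]))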

Classical ML Models

Classical ML Models: Preparing Data

Preparing Time Series Data for Machine Learning

  Time | Extra Feature | Feature of Interest
  t1   | e1            | x1
  t2   | e2            | x2
  t3   | e3            | x3
  t4   | e4            | x4
  t5   | e5            | x5
  t6   | e6            | x6
  t7   | e7            | x7
  t8   | e8            | x8
  t9   | e9            | x9
  t10  | e10           | x10
  t11  | e11           | x11
  t12  | e12           | x12

One Step Forecasting Data

  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t1   | e1            | x1                  | x2
  t2   | e2            | x2                  | x3
  t3   | e3            | x3                  | x4
  t4   | e4            | x4                  | x5
  t5   | e5            | x5                  | x6
  t6   | e6            | x6                  | x7
  t7   | e7            | x7                  | x8
  t8   | e8            | x8                  | x9
  t9   | e9            | x9                  | x10
  t10  | e10           | x10                 | x11
  t11  | e11           | x11                 | x12
  t12  | e12           | x12                 | NaN
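
A small pandas sketch of building exactly this frame; the row and column names mirror the table and are illustrative:

import pandas as pd

df = pd.DataFrame({
    "extra_feature":       [f"e{i}" for i in range(1, 13)],
    "feature_of_interest": [f"x{i}" for i in range(1, 13)],
}, index=[f"t{i}" for i in range(1, 13)])

# The forecast target is the feature of interest shifted one step into the
# future; the last row has no future value, hence NaN.
df["forecast_feature_of_interest"] = df["feature_of_interest"].shift(-1)
print(df)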

Random Train-Test Split
(Full data: rows t1-t11 of the table on the previous slide)

  Table: Train Set
  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t1   | e1            | x1                  | x2
  t2   | e2            | x2                  | x3
  t4   | e4            | x4                  | x5
  t6   | e6            | x6                  | x7
  t7   | e7            | x7                  | x8
  t9   | e9            | x9                  | x10
  t10  | e10           | x10                 | x11

  Table: Test Set
  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t3   | e3            | x3                  | x4
  t5   | e5            | x5                  | x6
  t8   | e8            | x8                  | x9
  t11  | e11           | x11                 | x12

Sequential Train-Test Split
(Full data: rows t1-t11 of the table from the one step forecasting slide)

  Table: Train Set
  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t1   | e1            | x1                  | x2
  t2   | e2            | x2                  | x3
  t3   | e3            | x3                  | x4
  t4   | e4            | x4                  | x5
  t5   | e5            | x5                  | x6
  t6   | e6            | x6                  | x7
  t7   | e7            | x7                  | x8
  t8   | e8            | x8                  | x9

  Table: Test Set
  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t9   | e9            | x9                  | x10
  t10  | e10           | x10                 | x11
  t11  | e11           | x11                 | x12
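
A sketch of the sequential split, continuing with the frame df built in the earlier sketch; the 75% train ratio is illustrative and reproduces the t1-t8 / t9-t11 split above:

supervised = df.dropna()                      # drop the final row with a NaN target
split_point = int(len(supervised) * 0.75)
train_df = supervised.iloc[:split_point]      # oldest rows
test_df = supervised.iloc[split_point:]       # most recent rows
print(train_df.index.tolist(), test_df.index.tolist())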

Multiple Train-Test Split

  Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
  t1   | e1            | x1                  | x2
  t2   | e2            | x2                  | x3
  t3   | e3            | x3                  | x4
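
Multiple chronological train-test splits can be generated with scikit-learn's TimeSeriesSplit, which is one common way to realize this idea; n_splits=4 is illustrative and supervised is the frame from the previous sketch:

from sklearn.model_selection import TimeSeriesSplit

tscv = TimeSeriesSplit(n_splits=4)
for fold, (train_idx, test_idx) in enumerate(tscv.split(supervised)):
    # each fold trains on an expanding window of the past and tests on the next block
    print(f"fold {fold}: train rows {train_idx[0]}..{train_idx[-1]}, "
          f"test rows {test_idx[0]}..{test_idx[-1]}")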
