Complex Networks - SFU

Transcription

Complex Networks
Ljiljana Trajković
ljilja@cs.sfu.ca
Communication Networks Laboratory
http://www.sfu.ca/~ljilja/cnl
School of Engineering Science
Simon Fraser University, Vancouver, British Columbia, Canada

Simon Fraser University, Burnaby Campus
SISY 2019, Subotica, Serbia, September 12, 2019

Roadmap
§ Introduction
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Roadmap
§ Introduction:
  § Complex networks
  § Machine learning
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Complexity Sciences

The Internet
https://en.wikipedia.org/wiki/Complex_network#/media/File:Internet_map_1024.jpg
By The Opte Project, originally from the English Wikipedia (curid=1538544)

The Internet
Partial map of the Internet based on the January 15, 2005 data found on opte.org. Each line is drawn between two nodes, representing two IP addresses. The length of the lines is indicative of the delay between those two nodes. This graph represents less than 30% of the Class C networks reachable by the data collection program in early 2005. Lines are color-coded according to their corresponding RFC 1918 allocation as follows: Dark blue: net, ca, us; Green: com, org; Red: mil, gov, edu; Yellow: jp, cn, tw, au, de; Magenta: uk, it, pl, fr; Gold: br, kr, nl; White: unknown.

Scale-Free Network (curid=29364647)

Scale-Free Network
An example of a complex scale-free network. The graph represents the metadata of thousands of archive documents, documenting the social network of hundreds of League of Nations personnel.
M. Grandjean, "La connaissance est un réseau," Les Cahiers du Numérique, vol. 10, no. 3, pp. 37-54, 2014.

Roadmap
§ Introduction:
  § Complex networks
  § Machine learning
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Machine Learning
§ Using machine learning techniques to detect network intrusions is an important topic in cybersecurity.
§ Machine learning algorithms have been used to successfully classify network anomalies and intrusions.
§ Supervised machine learning algorithms:
  § Support vector machine (SVM)
  § Long short-term memory (LSTM)
  § Gated recurrent unit (GRU)
  § Broad learning system (BLS)

Roadmap
§ Introduction
§ Data processing:
  § BGP datasets
  § NSL-KDD dataset
  § CICIDS2017
  § CSE-CIC-IDS2018
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

BGP and NSL-KDD Datasets
§ Used to evaluate anomaly and intrusion detection techniques
§ BGP:
  § Routing records from Réseaux IP Européens (RIPE)
  § BCNET regular traffic
§ NSL-KDD:
  § An improvement of the KDD'99 dataset
  § Used in various intrusion detection systems (IDSs)

CICIDS2017 and CSE-CIC-IDS2018
§ CICIDS2017 and CSE-CIC-IDS2018:
  § Testbed used to create the publicly available datasets that include multiple types of recent cyber attacks
§ Network traffic collected between:
  § Monday, 03.07.2017 and Friday, 07.07.2017 (CICIDS2017)
  § Wednesday, 14.02.2018 and Friday, 02.03.2018 (CSE-CIC-IDS2018)

BGP Datasets
§ Anomalous data: days of the attack
§ Regular data: two days prior and two days after the attack
§ 37 numerical features extracted from BGP update messages
§ Best performance: 60% of the data for training and 40% for testing (see the sketch below)
§ For each event (Code Red I, Nimda, Slammer), the dataset table lists the regular and anomalous durations (min) and the numbers of regular and anomalous training data points.
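As a minimal sketch of the 60%/40% split (the feature matrix X, labels y, and the use of scikit-learn are illustrative assumptions, not the authors' code):

```python
# Hedged sketch: split 37-feature BGP data 60% training / 40% testing.
# X and y are random placeholders standing in for the real dataset.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 37))       # 37 numerical features per data point
y = rng.integers(0, 2, size=1000)     # 0 = regular, 1 = anomaly

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.6, test_size=0.4, stratify=y, random_state=42)
```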

NSL-KDD Dataset
§ KDDTrain+ and KDDTest+: training and test datasets
§ KDDTest-21: a subset of the KDDTest+ dataset that does not include records correctly classified by all 21 models

             Regular   DoS      U2R   R2L     Probe    Total
KDDTrain+    67,343    45,927   52    995     11,656   125,973
KDDTest+     9,711     7,458    200   2,754   2,421    22,544
KDDTest-21   2,152     4,342    200   2,754   2,402    11,850

CICIDS2017 Dataset: Types of Intrusion Attacks

Attack                   Label                                      Day                              Number of intrusions
Brute force              FTP, SSH                                   Tuesday                          7,935; 5,897
Heartbleed               Heartbleed                                 Wednesday                        11
Web attack               Brute force, XSS, SQL Injection            Thursday morning                 1,507; 652; 21
DoS                      Slowloris, Hulk, GoldenEye, SlowHTTPTest   Wednesday                        5,796; 230,124; 10,293; 5,499
DDoS                     DDoS                                       Friday afternoon                 128,027
Infiltration, PortScan   Infiltration, PortScan                     Thursday and Friday afternoons   36; 158,930
Bot                      Bot                                        Friday morning                   1,956

CICIDS2017 Dataset: Number of Flows

Day                            Valid flows   All flows
Wednesday                      691,406       692,703
Thursday (morning)             170,231       170,366
Thursday (afternoon)           288,395       288,602
Friday (morning)               190,911       191,033
Friday (afternoon, PortScan)   286,096       286,467
Friday (afternoon, DDoS)       225,711       225,745

CSE-CIC-IDS2018 Dataset: Types of Intrusion Attacks

Date         Attack                          Day
14.02.2018   FTP-BF, SSH-BF                  Wednesday
             DoS-GE, S-HOIC
22.02.2018   Web-BF, XSS-BF, SQL Injection   Thursday
23.02.2018   Web-BF, XSS-BF, SQL Injection   Friday
28.02.2018   Infiltration                    Wednesday
01.03.2018   Infiltration                    Thursday
02.03.2018   Bot                             Friday

The table also lists the number of benign instances and the number of intrusions for each day.

Roadmap
§ Introduction
§ Data processing
§ Machine learning models:
  § Deep learning: multi-layer recurrent neural networks
  § Broad learning system
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Deep Learning Neural Network
§ 37 (BGP) / 109 (NSL-KDD) RNN nodes, and 80 (FC1), 32 (FC2), and 16 (FC3) fully connected (FC) hidden nodes (sketched below):
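A hedged PyTorch sketch of such a network; the layer widths (80, 32, 16) and the 37-feature input follow the slide, while the choice of nn.LSTM, two recurrent layers, the hidden size, and the binary output are assumptions:

```python
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    """Recurrent layers followed by FC1 (80), FC2 (32), and FC3 (16) hidden nodes."""
    def __init__(self, num_features=37, hidden_size=37, num_layers=2):
        super().__init__()
        self.rnn = nn.LSTM(num_features, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Sequential(
            nn.Linear(hidden_size, 80), nn.ReLU(),   # FC1: 80 hidden nodes
            nn.Linear(80, 32), nn.ReLU(),            # FC2: 32 hidden nodes
            nn.Linear(32, 16), nn.ReLU(),            # FC3: 16 hidden nodes
            nn.Linear(16, 2),                        # regular vs. anomaly logits
        )

    def forward(self, x):                  # x: (batch, sequence, features)
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])      # classify from the last time step

logits = RNNClassifier()(torch.randn(8, 10, 37))   # batch of 8, 10 time steps
```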

Long Short-Term Memory
§ Repeating module for the Long Short-Term Memory (LSTM) neural network:

Long Short-Term Memory: LSTM
§ The outputs of the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$ at time $t$ are:
$f_t = \sigma(W_{xf} x_t + b_{xf} + U_{hf} h_{t-1} + b_{hf})$
$i_t = \sigma(W_{xi} x_t + b_{xi} + U_{hi} h_{t-1} + b_{hi})$
$o_t = \sigma(W_{xo} x_t + b_{xo} + U_{ho} h_{t-1} + b_{ho})$,
where:
§ $\sigma(\cdot)$: logistic sigmoid function
§ $x_t$: current input vector
§ $h_{t-1}$: previous output vector
§ $W_{xf}$, $U_{hf}$, $W_{xi}$, $U_{hi}$, $W_{xo}$, and $U_{ho}$: weight matrices
§ $b_{xf}$, $b_{hf}$, $b_{xi}$, $b_{hi}$, $b_{xo}$, and $b_{ho}$: bias vectors

Long Short-Term Memory: LSTM
§ Output $i_t$ of the input gate decides whether the information will be stored in the cell state. The sigmoid function is used to update the information.
§ Cell state $c_t$:
$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + b_{xc} + U_{hc} h_{t-1} + b_{hc})$,
where:
§ $\odot$ denotes element-wise multiplication
§ $\tanh$ function: used to create a vector for the next cell state
§ Output of the LSTM cell:
$h_t = o_t \odot \tanh(c_t)$
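A direct NumPy transcription of the LSTM equations above; the dimensions and random parameters are illustrative only:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: gates f, i, o, cell state c_t, and output h_t."""
    f_t = sigmoid(p["Wxf"] @ x_t + p["bxf"] + p["Uhf"] @ h_prev + p["bhf"])
    i_t = sigmoid(p["Wxi"] @ x_t + p["bxi"] + p["Uhi"] @ h_prev + p["bhi"])
    o_t = sigmoid(p["Wxo"] @ x_t + p["bxo"] + p["Uho"] @ h_prev + p["bho"])
    c_t = f_t * c_prev + i_t * np.tanh(
        p["Wxc"] @ x_t + p["bxc"] + p["Uhc"] @ h_prev + p["bhc"])
    h_t = o_t * np.tanh(c_t)               # * is the element-wise product
    return h_t, c_t

n_in, n_hid = 37, 16                       # illustrative dimensions
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(n_hid, n_in) if k[0] == "W" else (n_hid, n_hid))
     for k in ("Wxf", "Wxi", "Wxo", "Wxc", "Uhf", "Uhi", "Uho", "Uhc")}
p.update({k: np.zeros(n_hid)
          for k in ("bxf", "bxi", "bxo", "bxc", "bhf", "bhi", "bho", "bhc")})
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), p)
```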

Gated Recurrent Unit
§ Repeating module for the Gated Recurrent Unit (GRU) neural network:

Gated Recurrent Unit: GRU
§ The outputs of the reset gate $r_t$ and the update gate $z_t$ at time $t$:
$r_t = \sigma(W_{xr} x_t + b_{xr} + U_{hr} h_{t-1} + b_{hr})$
$z_t = \sigma(W_{xz} x_t + b_{xz} + U_{hz} h_{t-1} + b_{hz})$,
where:
§ $\sigma$: sigmoid function
§ $x_t$: input; $h_{t-1}$: previous output of the GRU cell
§ $W_{xr}$, $U_{hr}$, $W_{xz}$, and $U_{hz}$: weight matrices
§ $b_{xr}$, $b_{hr}$, $b_{xz}$, and $b_{hz}$: bias vectors

Gated Recurrent Unit: GRU
§ Output of the GRU cell:
$h_t = (1 - z_t) \odot n_t + z_t \odot h_{t-1}$,
where:
§ $n_t = \tanh(W_{xn} x_t + b_{xn} + r_t \odot (U_{hn} h_{t-1} + b_{hn}))$
§ $W_{xn}$ and $U_{hn}$: weight matrices
§ $b_{xn}$ and $b_{hn}$: bias vectors
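The same style of NumPy sketch for one GRU step; dimensions and random parameters are again illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU step: reset gate r, update gate z, candidate n, output h_t."""
    r_t = sigmoid(p["Wxr"] @ x_t + p["bxr"] + p["Uhr"] @ h_prev + p["bhr"])
    z_t = sigmoid(p["Wxz"] @ x_t + p["bxz"] + p["Uhz"] @ h_prev + p["bhz"])
    n_t = np.tanh(p["Wxn"] @ x_t + p["bxn"] + r_t * (p["Uhn"] @ h_prev + p["bhn"]))
    return (1.0 - z_t) * n_t + z_t * h_prev   # h_t

n_in, n_hid = 37, 16                           # illustrative dimensions
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(n_hid, n_in) if k[0] == "W" else (n_hid, n_hid))
     for k in ("Wxr", "Wxz", "Wxn", "Uhr", "Uhz", "Uhn")}
p.update({k: np.zeros(n_hid)
          for k in ("bxr", "bxz", "bxn", "bhr", "bhz", "bhn")})
h = gru_step(rng.normal(size=n_in), np.zeros(n_hid), p)
```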

Roadmap
§ Introduction
§ Data processing
§ Machine learning models:
  § Deep learning: multi-layer recurrent neural networks
  § Broad learning system
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Broad Learning System
§ Module of the Broad Learning System (BLS) algorithm with increments of mapped features, enhancement nodes, and new input data:

Original BLS
§ Matrix $A^n_m$ is constructed from groups of mapped features $Z^n$ and groups of enhancement nodes $H^m$ as:
$A^n_m = [Z^n \mid H^m] = [\phi(X W_{e_i} + \beta_{e_i}) \mid \xi(Z^n W_{h_j} + \beta_{h_j})]$,
where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$:
§ $\phi$ and $\xi$: projection mappings
§ $W_{e_i}$, $W_{h_j}$: weights
§ $\beta_{e_i}$, $\beta_{h_j}$: bias parameters
§ Modified to include additional mapped features $Z_{n+1}$, enhancement nodes $H_{m+1}$, and/or input nodes $X_a$

Original BLS
§ The Moore-Penrose pseudoinverse of matrix $A^n_m$ is computed to calculate the weights of the output:
$W^n_m = [A^n_m]^{+} Y$
§ During the testing process, data labels are deduced using the calculated weights $W^n_m$, mapped features $Z^n$, and enhancement nodes $H^m$:
$\hat{Y} = A^n_m W^n_m = [Z_1, \ldots, Z_n \mid H_1, \ldots, H_m] W^n_m$
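A minimal NumPy sketch of BLS training and testing as described above; using tanh for the projection mappings φ and ξ, the layer sizes, and the ridge-regularized pseudoinverse are assumptions:

```python
import numpy as np

def train_bls(X, Y, n_groups=2, map_size=10, n_enh=50, reg=1e-3, seed=0):
    """Build A = [Z | H] from random mapped features and enhancement nodes,
    then solve for the output weights W = A^+ Y (ridge-regularized)."""
    rng = np.random.default_rng(seed)
    groups, feat_params = [], []
    for _ in range(n_groups):                 # Z_i = phi(X We_i + be_i)
        We = rng.normal(size=(X.shape[1], map_size))
        be = rng.normal(size=map_size)
        groups.append(np.tanh(X @ We + be))   # phi = tanh (an assumption)
        feat_params.append((We, be))
    Z = np.hstack(groups)                     # Z^n = [Z_1, ..., Z_n]
    Wh = rng.normal(size=(Z.shape[1], n_enh))
    bh = rng.normal(size=n_enh)
    H = np.tanh(Z @ Wh + bh)                  # H = xi(Z^n Wh + bh)
    A = np.hstack([Z, H])                     # A = [Z^n | H^m]
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return W, feat_params, (Wh, bh)

def predict_bls(X, W, feat_params, enh_params):
    """Testing: Y_hat = [Z_1, ..., Z_n | H_1, ..., H_m] W."""
    Z = np.hstack([np.tanh(X @ We + be) for We, be in feat_params])
    H = np.tanh(Z @ enh_params[0] + enh_params[1])
    return np.hstack([Z, H]) @ W

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 37))
Y = (X[:, :1] > 0).astype(float)              # toy labels
W, fp, ep = train_bls(X, Y)
Y_hat = predict_bls(X, W, fp, ep)
```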

RBF-BLS Extension
§ The RBF function is implemented using a Gaussian kernel:
$\xi(x) = \exp\left(-\gamma \frac{\lVert x - c \rVert^{2}}{2}\right)$
§ Weight vectors $W$ of the output $HW$ are deduced from:
$W = (H^{T} H)^{-1} H^{T} Y = H^{+} Y$,
where:
§ $W = [\omega_1, \omega_2, \ldots, \omega_L]$: output weights
§ $H = [\xi_1, \xi_2, \ldots, \xi_L]$: hidden nodes
§ $H^{+}$: pseudoinverse of $H$
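A short sketch of Gaussian-kernel enhancement nodes; choosing the centers c from the mapped features and the value of γ are assumptions:

```python
import numpy as np

def rbf_nodes(Z, centers, gamma=1.0):
    """xi(x) = exp(-gamma * ||x - c||^2 / 2) for every row x and center c."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2 / 2.0)

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 20))                           # mapped features
centers = Z[rng.choice(len(Z), size=30, replace=False)]  # centers c (assumed)
H = rbf_nodes(Z, centers)                                # hidden nodes xi_1..xi_L
Y = (Z[:, :1] > 0).astype(float)                         # toy labels
W = np.linalg.pinv(H) @ Y                                # W = H^+ Y
```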

Cascades of Mapped Features

Cascades of Mapped Features
§ Cascade of mapped features (CFBLS): the new group of mapped features is created by using the previous group ($k - 1$).
§ Groups of mapped features are formulated as:
$Z_k = \phi(Z_{k-1} W_{e_k} + \beta_{e_k}) \triangleq \phi^{k}(X; \{W_{e_i}, \beta_{e_i}\}_{i=1}^{k})$, for $k = 1, \ldots, n$
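A sketch of the CFBLS cascade, where each group of mapped features is computed from the previous group (the group sizes and φ = tanh are assumptions):

```python
import numpy as np

def cascade_mapped_features(X, sizes, seed=0):
    """Z_k = phi(Z_{k-1} We_k + be_k), with Z_0 = X; returns [Z_1, ..., Z_n]."""
    rng = np.random.default_rng(seed)
    Z, groups = X, []
    for size in sizes:
        We = rng.normal(size=(Z.shape[1], size))
        be = rng.normal(size=size)
        Z = np.tanh(Z @ We + be)        # this group feeds the next one
        groups.append(Z)
    return np.hstack(groups)

Zn = cascade_mapped_features(
    np.random.default_rng(1).normal(size=(100, 37)), sizes=[20, 20, 20])
```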

Cascades of Enhancement Nodes

Cascades of Enhancement Nodes
§ The first enhancement node in a cascade of enhancement nodes (CEBLS) is generated from the mapped features.
§ The subsequent enhancement nodes are generated from previous enhancement nodes, creating a cascade:
$H_u = \xi^{u}(Z^n; \{W_{h_j}, \beta_{h_j}\}_{j=1}^{u})$, for $u = 1, \ldots, m$,
where:
§ $W_{h_j}$ and $\beta_{h_j}$: randomly generated
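An analogous sketch for the CEBLS cascade of enhancement nodes (ξ = tanh and the group sizes are assumptions):

```python
import numpy as np

def cascade_enhancement_nodes(Zn, sizes, seed=0):
    """H_1 comes from the mapped features Z^n; every later H_u comes from
    H_{u-1}, with randomly generated weights W_h and biases beta_h."""
    rng = np.random.default_rng(seed)
    H, groups = Zn, []
    for size in sizes:
        Wh = rng.normal(size=(H.shape[1], size))   # randomly generated
        bh = rng.normal(size=size)                 # randomly generated
        H = np.tanh(H @ Wh + bh)                   # xi = tanh (an assumption)
        groups.append(H)
    return np.hstack(groups)                       # [H_1, ..., H_m]

Hm = cascade_enhancement_nodes(
    np.random.default_rng(2).normal(size=(100, 60)), sizes=[50, 50])
```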

Cascades with Incremental Learning

Roadmap
§ Introduction
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Intrusion Detection System
§ Architecture:

Experimental Procedure
§ Step 1: Normalize training and test datasets.
§ Step 2: Train the RNN models and BLS using 10-fold cross-validation. Tune parameters of the RNN and BLS models.
§ Step 3: Test the RNN and BLS models.
§ Step 4: Evaluate models based on:
  § Accuracy
  § F-Score
*RNN: recurrent neural network
*BLS: broad learning system
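A hedged end-to-end sketch of Steps 1-4; the scikit-learn SVC stands in for the tuned RNN/BLS models, min-max scaling is one plausible normalization, and X, y are placeholders:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 37))            # placeholder features
y = rng.integers(0, 2, size=500)          # placeholder labels

# Step 1 (normalize) + stand-in model; Step 2 (10-fold cross-validation);
# Step 4 metrics: accuracy and F-Score.
model = make_pipeline(MinMaxScaler(), SVC())
scores = cross_validate(model, X, y, cv=10, scoring=("accuracy", "f1"))
print(scores["test_accuracy"].mean(), scores["test_f1"].mean())
```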

Most Relevant Features
§ CICIDS2017: 16 most relevant features

Most Relevant Features
§ CSE-CIC-IDS2018: 16 most relevant features

Number of BLS Training Parameters

Parameters                   Code Red I   Nimda   Slammer   NSL-KDD
Mapped features              100          500     100       100
Groups of mapped features    1            1       2         5
Enhancement nodes            500          700     300       100
Incremental learning steps   10           9       2         3
Data points/step

Number of BLS Training Parameters
Per-model parameters: mapped features (20, 10, 10, 20, 20, 15) and groups of mapped features.

Number of Incremental BLS Training Parameters
Per-model parameters for the incremental BLS models (CFBLS, CFEBLS, CEBLS, and BLS) with 78, 64, and 32 features: mapped features (10, 20, 10, 15, 20, 10), groups of mapped features, incremental learning steps, data points/step, and enhancement nodes/step.

Roadmap
§ Introduction
§ Data processing:
  § BGP datasets
  § NSL-KDD dataset
§ Machine learning models:
  § Deep learning: multi-layer recurrent neural networks
  § Broad learning system
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Training Time: RNN Models
Training times (s) for the BGP datasets, Python (CPU) and Python (GPU):

Dataset              LSTM2    LSTM3    LSTM4    GRU2     GRU3     GRU4
BGP, Python (GPU)             355.86   394.55   317.53   345.04   369.86

Training Time: BLS Models
Training times (s) for the BGP datasets, Python (CPU) and MATLAB (CPU):

Dataset             BLS   RBF-BLS   CFBLS   CEBLS    CFEBLS
BGP, Python (CPU)                   98.13   108.23   108.14
BGP, MATLAB (CPU)                                    88.95

LSTM Models: BGP Datasets
Test accuracy (%) and F-Score (%) for the Code Red I, Nimda, and Slammer datasets:
§ LSTM2: 92.98, 92.99, 85.97, 72.42
§ LSTM3: 90.90, 92.01, 84.38, 67.29
§ LSTM4: 92.49, 92.22, 86.18, 70.72

GRU Models: BGP Datasets
Test accuracy (%) and F-Score (%) for the Code Red I, Nimda, and Slammer datasets:
§ GRU2: 91.88, 93.33, 90.90, 69.42
§ GRU3: 91.76, 95.21, 90.83, 68.72
§ GRU4: 92.14, 92.15, 90.35, 70.11

BLS Models: BGP Datasets
Test accuracy (%) and F-Score (%) for the Code Red I, Nimda, and Slammer datasets:
§ BLS: 87.65, 75.62, 68.40, 57.68
§ RBF-BLS: 91.21, 90.55, 70.76, 64.57

BLS Models: BGP Datasets
Test accuracy (%) and F-Score (%) for the Code Red I, Nimda, and Slammer datasets:
§ CFBLS: 89.28, 71.25, 61.81, 60.99
§ CEBLS: 91.01, 87.71, 82.43, 66.38
§ CFEBLS: 86.36, 71.11, 57.71, 55.30

RNN and BLS Models: NSL-KDD Dataset

         Accuracy (%)               F-Score (%)
Model    KDDTest+    KDDTest-21     KDDTest+    KDDTest-21
CFBLS    82.20       67.47          82.23       76.29

BLS Model: CICIDS2017 and CSE-CIC-IDS2018 Datasets
For each dataset, the table lists the model (BLS, CEBLS), the number of features (78, 64, 32), training time (s), accuracy (%), and F-Score (%).

Incremental BLS Model: CICIDS2017 and CSE-CIC-IDS2018 Datasets
For each dataset, the table lists the incremental BLS model, the number of features (78, 64, 32), training time (s), accuracy (%), and F-Score (%).

Performance: BLS and Incremental BLS, CICIDS2017

Performance: BLS and Incremental BLS, CSE-CIC-IDS2018

Roadmap
§ Introduction
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

Conclusion
§ We evaluated the performance of:
  § LSTM and GRU deep recurrent neural networks with a variable number of hidden layers
  § BLS models that employ the radial basis function (RBF), cascades of mapped features and enhancement nodes, and incremental learning
§ BLS and cascade combinations of mapped features and enhancement nodes achieved comparable performance and shorter training time because of their wide rather than deep structure.

Conclusion
§ BLS models:
  § consist of a small number of hidden layers and adjust weights using the pseudoinverse instead of back-propagation
  § dynamically update weights in the case of incremental learning
  § achieve better-optimized weights for large datasets (NSL-KDD) due to additional data points
§ While increasing the number of mapped features, enhancement nodes, and groups of mapped features led to better performance, it required additional memory and training time.

Roadmap
§ Introduction
§ Data processing
§ Machine learning models
§ Experimental procedure
§ Performance evaluation
§ Conclusion and references

References: Datasets
§ BCNET: http://www.bc.net/
§ RIPE RIS raw data: https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris
§ NSL-KDD: https://www.unb.ca/cic/datasets/nsl.html
§ CICIDS2017: https://www.unb.ca/cic/datasets/ids-2017.html
§ CSE-CIC-IDS2018: https://www.unb.ca/cic/datasets/ids-2018.html

References: Intrusion Detection
§ Python:
  § Pandas: https://pandas.pydata.org/
  § PyTorch: https://pytorch.org/
§ Broad Learning: http://www.broadlearning.ai/
§ V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: a survey," ACM Comput. Surv., vol. 41, no. 3, pp. 15:1-15:58, July 2009.
§ M. C. Libicki, L. Ablon, and T. Webb, The Defender's Dilemma: Charting a Course Toward Cybersecurity. Santa Monica, CA, USA: RAND Corporation, 2015.
§ N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Trans. Emerg. Topics Comput. Intell., vol. 2, no. 1, pp. 41-50, Feb. 2018.

References: Deep Learning
§ S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735-1780, Oct. 1997.
§ G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," Computing Research Repository (CoRR), abs/1207.0580, pp. 1-18, Jul. 2012.
§ K. Cho, B. van Merriënboer, C. Gülçehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," in Proc. 2014 Conf. Empirical Methods Natural Lang. Process., Doha, Qatar, Oct. 2014, pp. 1724-1734.
§ D. P. Kingma and J. Ba, "Adam: a method for stochastic optimization," in Proc. 3rd Int. Conf. Learn. Representations, San Diego, CA, USA, May 2015, pp. 1-15.
§ K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, "LSTM: a search space odyssey," IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 10, pp. 2222-2232, Oct. 2017.
§ I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: The MIT Press, 2016.

References: Broad Learning System
