Detection And Prevention Of SIP Flooding Attacks In Voice Over IP Networks


2012 Proceedings IEEE INFOCOM

Jin Tang, Yu Cheng and Yong Hao
Department of Electrical and Computer Engineering
Illinois Institute of Technology
Email: {jtang9, cheng, yhao4}@iit.edu

Abstract: As voice over IP (VoIP) increasingly gains popularity, traffic anomalies such as SIP flooding attacks are also emerging and becoming a major threat to the technology. Thus, detecting and preventing such anomalies is critical to ensure an effective VoIP system. The existing flooding detection schemes are inefficient in detecting low-rate flooding from dynamic background traffic, or may even totally fail when flooding is launched in a multi-attribute manner by simultaneously manipulating different types of SIP messages. In this paper, we develop an online scheme to detect and subsequently prevent the flooding attacks, by integrating a novel three-dimensional sketch design with the Hellinger distance (HD) detection technique. The sketch data structure summarizes the incoming SIP messages into a compact and constant-size data set based on which a separate probability distribution can be established for each SIP attribute. The HD monitors the evolution of the probability distributions and detects flooding attacks when abnormal variations are observed. The three-dimensional design equips our scheme with the advantages of high detection accuracy even for low-rate flooding, robust performance under multi-attribute flooding, and the capability of selectively discarding the offending SIP messages to prevent the attacks. Moreover, we develop an estimation freeze mechanism to protect the detection threshold from being polluted by attacks. We not only analyze the performance of the proposed detection and prevention techniques theoretically, but also examine it thoroughly through extensive simulations.

This work was supported in part by NSF grant CNS-1117687.
978-1-4673-0775-8/12/$31.00 ©2012 IEEE

I. INTRODUCTION

Compared to the traditional public switched telephone network (PSTN), voice over IP (VoIP) is a much more economical technology, but with the tradeoff of more security concerns due to its open infrastructure, mainly based on the session initiation protocol (SIP) [1] and the Internet protocol (IP). The SIP flooding attack is among the most severe of all because it is easy to launch and capable of quickly draining the resources of both networks and nodes. The attack disrupts perceived quality of service (QoS) and subsequently leads to denial of service (DoS). Furthermore, SIP is a transactional protocol and possesses multiple controlling message attributes; the flooding attacks can thus take diverse forms and together constitute a multi-attribute attack. In order to achieve a secure VoIP system, an anomaly defense system is desired to detect the flooding attacks, classify their respective forms, and prevent the attacks from damaging the services.

Detecting anomalies from network traffic can be modeled as distinguishing odd traffic behavior from normal behavior, which is estimated based on history information. Such approaches resemble anomaly detection in statistics [2], where measurements of the investigated data form a time series for analysis. In the case of a flooding attack, an intuitive choice for such measurements is traffic volume/rate, since an unreasonable volume/rate burst can well imply some malicious behavior on the network [3], [4]. However, one major limitation of volume/rate-based monitoring is that low-rate flooding can hardly be distinguished from the normal rate fluctuation due to randomness. Fortunately, besides minor volume/rate changes, anomalies are likely to induce probability distributions that differ from the normal one, which reveals the presence of anomalies.
The Hellinger distance (HD) [5] is a well-known metric to describe the deviation between two probability distributions, which has been used in [6] to implement a flooding detection system with good sensitivity. However, the scheme in [6] establishes a probability distribution by monitoring the relative proportions of four types of SIP messages associated with four SIP attributes within the total traffic. The detection method will become ineffective if the four attributes are proportionally flooded simultaneously. We refer to such an attack as a multi-attribute attack in this paper. Also, through investigation, we find that because there is a relatively large time difference between the BYE attribute and the other three attributes due to call holding times, dynamic normal traffic arrivals can severely undermine the effectiveness of the scheme in [6]. Moreover, the scheme does not address the important issues of how to protect the detection threshold from being polluted by attacks and how to subsequently prevent the attacks after detection.

In this paper, we develop a versatile scheme for detecting and preventing the SIP flooding attacks in VoIP networks, by integrating the sketch technique [3], [7] with HD-based detection for a more effective and flexible solution. Sketch is capable of summarizing each of the incoming SIP messages into a compact and constant-size data set by random hash operations. Based on the sketch data set, we can establish a probability distribution for each SIP attribute independently, termed the sketch data distribution, which is the cornerstone of our design. Specifically, we design a generic three-dimensional sketch: the sketch comprises multiple two-dimensional attribute hash-tables (one for each SIP attribute), and each attribute table consists of multiple element hash-rows (each associated with a different hash function). The three-dimensional sketch design allows us to apply HD detection to examine the anomaly over each SIP attribute separately and therefore successfully resolve the multi-attribute attack. The multiple element hash-rows provide a voting scheme to improve detection accuracy. Also, due to the separate examination of each attribute, the time difference between the attributes does not affect our scheme and we are able to maintain high detection accuracy under dynamic normal traffic arrivals. Furthermore, the multiple hash-row design within an attribute table can be leveraged to identify the offending SIP messages responsible for the flooding attack over the attribute under consideration. We can then selectively discard those messages to efficiently prevent the attack. In addition, we develop an estimation freeze mechanism that can protect the HD threshold estimation from being impacted by the attacks. A side benefit of the estimation freeze mechanism is that the durations of attacks can be identified.

We theoretically prove that our detection scheme can detect the flooding attack over a SIP attribute with high probability, assuming in an ideal case that the sketch data distribution associated with normal traffic could be accurately measured by a training data series. We also prove that when the HD indicates an attack, an entry in an element hash-row with a larger value than the estimated normal value must be associated with some offending SIP messages, which is the theoretical foundation for our prevention scheme design. Performance of the proposed techniques is validated through extensive simulations and comparisons to the existing SIP flooding detection solution. In summary, this paper makes four main contributions. (1) By exploiting the sketch technique, we decouple the probability model construction from the specific SIP attributes, which significantly enhances the flexibility of the HD-based detection.
(2) We design a novel three-dimensional sketch, which equips our scheme with the advantages of high detection accuracy even for low-rate flooding attacks, robust performance under multi-attribute flooding attacks, and the capability of selectively discarding the offending SIP messages to efficiently prevent the attacks. (3) An estimation freeze mechanism is developed to protect the detection threshold from being impacted by attacks and to determine the attack durations. (4) We thoroughly examine the performance of the proposed techniques through theoretical analysis and computer simulations.

The remainder of the paper is organized as follows. Section II reviews more related work. Section III describes the system model. Section IV presents the proposed SIP flooding detection and prevention scheme. Section V gives the performance evaluation results. Section VI provides discussions on related issues. Section VII concludes the paper.

II. RELATED WORK

In the context of anomaly detection, several studies are based on classic time series forecasting analysis and outlier detection [8]. Sketch [7] is a technique to summarize high-dimensional data and provide scalable and flexible input to the time series forecasting model. Krishnamurthy et al. [3] utilize sketch in detecting behavior changes. However, their approach is based on the traffic volume, and requires the operation of retrieving data values for given keys from the sketch even in the normal condition. This can incur relatively high computational cost. In our scheme, we do not perform such an operation.

Using the destination addresses to profile traffic is a common approach to address the DoS problem [9], [10]; even though the attackers can be distributed, their target is concentrated on the victim addresses. This causes the traffic at destination addresses to significantly deviate from the normal condition and thus the attack will be detected.
However, such an approach is not practical in the SIP case, as the victim of flooding is usually a proxy server. The messages can be sent to the proxy server no matter what addresses are in the SIP destination header. In our work, we use the source addresses to profile traffic. This allows us to both detect the flooding attacks and identify the offending messages efficiently.

Surveys of the SIP security issues can be found in [14], [15]. The schemes presented in [11], [12] work effectively to detect SIP flooding. In their work, SIP transactional models are built to detect deviations from normal behaviors. However, these schemes are customized specifically to the SIP protocol suite and cannot be easily generalized to other flooding detection cases. In our scheme, by contrast, we can use the attributes associated with protocols other than SIP as keys to profile traffic, which gives a generic method to detect flooding attacks.

III. SYSTEM MODEL

A. SIP-based VoIP

VoIP [16] utilizes SIP [1] as the application-layer signaling protocol to establish, manage and terminate communication sessions. At the transport layer, SIP normally favors the user datagram protocol (UDP) over the transmission control protocol (TCP) due to the simplicity of UDP and the connection-oriented nature of SIP itself. There are three basic components in a SIP environment: the user agent client (UAC), the user agent server (UAS) and the SIP proxy server. These components are identified using the SIP address, which has a similar form to an email address, typically containing a username and a host name, e.g., "sip:alice@iit.edu". Messages are exchanged between these components to perform ordinary SIP operations.

The SIP messages used to establish and terminate sessions are basically INVITE, 200 OK, ACK and BYE. They are also called the SIP methods or attributes. A UAC initiates a SIP session by sending out an INVITE.
Intermediate proxies look over the destination SIP address in the message and forward it to the destined UAS, which responds with a 200 OK. An ACK message then finishes the three-way handshake to establish the session, and media will go directly between the UAC and the UAS. When the session is finished after some time (the call holding time), it is terminated by a BYE message from either of the calling parties.

B. Threat Model

SIP is vulnerable to network anomalies such as flooding attacks. These attacks can be easily mounted by utilizing various SIP traffic generators openly available on the Internet, e.g., SIPp [13]. The victim SIP proxy servers can be overwhelmed or even crashed by a large number of SIP messages within a short period of time.

SIP utilizes multiple methods/attributes to manage sessions. This allows attackers to exploit the vulnerabilities of these attributes to launch different forms of SIP flooding attacks. We describe some of these attacks below. They show that a general detection/prevention system is desired to defend against these attacks.

1) INVITE Flooding: In this attack, thousands of INVITE messages are generated and transmitted to the victim proxy servers, which can barely support all of them. Moreover, being a transactional protocol, SIP may require the intermediate proxy servers to maintain a state for each INVITE message while they are expecting the associated 200 OK. Thus the resources of these victim proxy servers could be exhausted almost in real time if the attack rate is high enough.

2) BYE Flooding: The BYE message is used to terminate SIP sessions. Therefore it can be utilized by the attackers to bring down ongoing VoIP phone calls. More severely, the attackers can just launch a brute-force BYE flooding attack to prematurely tear down most ongoing sessions in a VoIP network without knowing the SIP addresses of the legitimate users. Such flooding attacks will immediately cause call drops over a wide range of users.

3) Multi-Attribute Flooding: Intelligent attackers can launch different forms of SIP flooding attacks together against the victim proxy servers in a distributed manner. In this case, not only will the resources of the proxy servers be exhausted, but all the ongoing sessions may also be torn down at the same time, which makes the multi-attribute flooding attacks devastating to the VoIP service. Moreover, the attacks flood the four SIP attributes simultaneously and thus do not change the relative proportions of the attributes. Therefore the existing SIP flooding detection solution [6], based on observing significant deviations in such proportions, will become ineffective against the multi-attribute flooding attacks.

C. Detection and Prevention System

Our flooding detection and prevention system monitors the SIP messages arriving at a proxy server. We implement it in a firewall module, which can be deployed without modifying the proxy server. The system operation is based on two techniques, sketch and Hellinger distance.

1) Sketch: The sketch data structure is a probabilistic data summarization technique. It builds compact and constant-size summaries of high-dimensional data streams through random aggregation, by applying a hash function [17] to the data. Specifically, we consider that each data item consists of a key $k_i$ and its associated value $v_i$, represented as $a_i = (k_i, v_i)$, for constructing a sketch. Data items whose keys are hashed to the same value will be put in the same entry in the sketch, and their values will be added up to obtain the value of that entry. In our scheme, we use the SIP address as the key, and the value associated with each key is set as 1, indicating one SIP attribute generated from that address.

[Fig. 1. Illustration of a three-dimensional sketch design. Axes: $H$ hash functions $h_1(k_i), h_2(k_i), \ldots, h_H(k_i)$; $K$ sketch entries; SIP attributes.]

Using sketch makes our scheme scalable. No matter how many users exist in the VoIP network, sketch is able to derive a constant-size traffic summary. More importantly, sketch allows us to construct a probability distribution based on the sketch entries, with no need to investigate the correlation among different SIP attributes as described in [6].

2) Hellinger Distance: The Hellinger distance (HD) is used to measure the distance between two probability distributions [5].
To compute HD, suppose that we have two histogram distributions on the same sample space, namely, $P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots, q_n)$. The HD between the two distributions is defined as follows:

$$H^2(P, Q) = \frac{1}{2} \sum_{i=1}^{n} \left( \sqrt{p_i} - \sqrt{q_i} \right)^2. \tag{1}$$

It is not difficult to see that the HD reaches 1 if the two probability distributions are totally different and falls to 0 if they are identical. This property provides a good approach to quantify the similarity of two data sets in either normal or anomalous situations. Recall that we aim to build an anomaly detection system which needs a statistical model to represent the normal traffic condition and raises alarms when abnormal variations are observed. The property of HD makes it well suited to this role. A low HD value implies that there is no significant deviation in the current traffic observations, and a high HD is a strong indication that anomalies have happened.

IV. DETECTION AND PREVENTION SCHEME DESIGN

In this section we describe our scheme to detect and prevent the SIP flooding attacks. The scheme is based on integrating the two techniques introduced in Section III, sketch and Hellinger distance.

A. Three-Dimensional Design

The SIP flooding attack can take different forms and thus induce changes in multiple SIP attributes. We must be able to isolate the changes across the attributes, then discriminate the diverse attack forms and cope with the multi-attribute attack.

Fig. 1 gives an illustration of our three-dimensional sketch design. The sketch comprises multiple two-dimensional attribute hash-tables, each of which is built for a SIP attribute. We build four such tables for the four SIP attributes investigated. An attribute hash-table consists of $H$ element hash-rows, each of which is associated with a different hash function and has $K$ entries. We construct the hash functions using independent random seeds [17], and therefore they are independent of each other. The hash functions are kept secret because the seeds are not known to others. The three-dimensional sketch design allows us to separately summarize each of the SIP attributes. In the following, we first discuss how to calculate an HD based on each hash-row, and then describe the operation in the context of the three-dimensional sketch.

We divide time into discrete intervals, each of a constant length $d$. The messages associated with a certain SIP attribute under consideration are indexed as a data stream. The data stream then passes through two periods: a training period and a test period. The training period contains $T$ consecutive time intervals and the test period is the $(T+1)$th interval. We build two sketches, one for the training period and the other for the test period. The SIP address of each message is used as the key for the data to be put into the sketch. These two sketches generate two probability distributions for HD analysis.

Based on the training set, we obtain a sketch data distribution $P$. Suppose that the values of the $K$ entries are $n_1, n_2, \ldots, n_K$, and we denote $N = \sum_{i=1}^{K} n_i$. Then we define the distribution $P$ as

$$P = \left( \frac{n_1}{N}, \frac{n_2}{N}, \ldots, \frac{n_K}{N} \right). \tag{2}$$

Similarly, we obtain a distribution $Q$ based on the sketch for the test period. Suppose that the values of the $K$ entries of the test sketch are $m_1, m_2, \ldots, m_K$, with $M = \sum_{i=1}^{K} m_i$. We can have the distribution $Q$ as

$$Q = \left( \frac{m_1}{M}, \frac{m_2}{M}, \ldots, \frac{m_K}{M} \right). \tag{3}$$

The Hellinger distance of the above two distributions is then calculated as

$$H^2(P, Q) = \frac{1}{2} \sum_{i=1}^{K} \left( \sqrt{\frac{n_i}{N}} - \sqrt{\frac{m_i}{M}} \right)^2. \tag{4}$$

We monitor the data stream by tracing the HD. Assume that there is no attack in the first training set, which initially represents the normal condition. To calculate the HD, we obtain the "test" distribution $Q$ from the current time interval and the "training" distribution $P$ from the immediately preceding $T$ time intervals. We continue this operation and move the test and training periods forward by one time interval at a time, as long as the HD is smaller than a threshold. Such a sliding window mechanism estimates the pattern of the data stream better than directly analyzing two consecutive individual time intervals. It reflects the dynamics of the evolving traffic and smooths sudden fluctuations in normal traffic.

All the $H$ hash-rows in an attribute hash-table independently monitor the data stream associated with a certain SIP attribute, following the same operation as described above. Similarly, in the three-dimensional sketch, the four attribute hash-tables investigate the four SIP attributes separately and are prepared for the attack detection.

[Fig. 2. Sliding window in estimation freeze mechanism. Stages shown: Detect, Freeze ($D$ intervals), Resume; each stage shows the training window of $T$ intervals and the test interval.]

B. Threshold under Attack

1) Detection Threshold: As we want to utilize HD to model the traffic behavior over time, a detection threshold is needed to reflect the normal condition and be the actual indicator of anomalies. Since normal traffic behaviors also fluctuate over time and the distribution obtained based on the sketch may not even be stationary, the HD in the normal condition will be non-zero and may dynamically change.
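As a concrete reference for the quantity being thresholded, the per-row computation in (2)-(4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the keyed BLAKE2b hash stands in for the unspecified seeded hash function of [17], and the function names, the seed, the row size and the `example.org` addresses are all hypothetical. The function returns the squared quantity $H^2(P, Q)$ exactly as written in (4).

```python
import hashlib
import math


def make_hash_row(seed: bytes, num_entries: int):
    """Return a hash function for one element hash-row.

    Assumption: a keyed BLAKE2b digest plays the role of the secret,
    seed-dependent hash function; the paper does not name one.
    """
    def h(key: str) -> int:
        digest = hashlib.blake2b(key.encode("utf-8"), key=seed,
                                 digest_size=8).digest()
        return int.from_bytes(digest, "big") % num_entries
    return h


def sketch_row(addresses, h, num_entries):
    """Add a value of 1 to the entry that each source SIP address hashes to."""
    counts = [0] * num_entries
    for addr in addresses:
        counts[h(addr)] += 1
    return counts


def distribution(counts):
    """Normalize entry counts into a probability distribution, as in (2), (3)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty sketch row")
    return [c / total for c in counts]


def hellinger_sq(p, q):
    """Squared Hellinger distance between two distributions, as in (4)."""
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


if __name__ == "__main__":
    K = 32  # illustrative number of entries per row
    h = make_hash_row(b"illustrative-secret-seed", K)
    # Training set: messages of one SIP attribute over the preceding T intervals.
    training = [f"sip:user{i % 50}@example.org" for i in range(1000)]
    # Test interval, once without and once with a single flooding source.
    normal_test = [f"sip:user{i % 50}@example.org" for i in range(100)]
    flooded_test = normal_test + ["sip:attacker@example.org"] * 100
    p = distribution(sketch_row(training, h, K))
    print(hellinger_sq(p, distribution(sketch_row(normal_test, h, K))))
    print(hellinger_sq(p, distribution(sketch_row(flooded_test, h, K))))
```

In a full three-dimensional sketch, one such row would exist for each of the $H$ hash functions in each of the four attribute hash-tables, each with its own seed.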
In order to properly model the behavior, we adopt the exponentially weighted moving average (EWMA) method in our scheme to compute a dynamic threshold.

Let $h_n$ denote the value of HD in the current time interval $n$. To smooth its fluctuation, we calculate an estimated average, $H_n$, of $h_n$ as

$$H_n = (1 - \alpha) \cdot H_{n-1} + \alpha \cdot h_n. \tag{5}$$

Next, to have an estimate of how much $H_n$ deviates from $h_n$, we compute the current mean deviation $S_n$ as

$$S_n = (1 - \beta) \cdot S_{n-1} + \beta \cdot |H_n - h_n|. \tag{6}$$

Then, given the values of $H_n$ and $S_n$, we derive the estimated threshold $H^{Thre}_{n+1}$ as follows:

$$H^{Thre}_{n+1} = \lambda \cdot H_n + \mu \cdot S_n, \tag{7}$$

where $\lambda$ and $\mu$ are multiplication factors used to set a safety margin for the threshold. Due to the ability of HD to accurately monitor the difference between two probability distributions, proper values of these two parameters may greatly reduce false alarms. The parameters $\alpha$, $\beta$, $\lambda$ and $\mu$ are all tunable parameters in the model. We set their initial values according to previous research [6] and tune them in our experiments to achieve desirable detection accuracy.

2) Estimation Freeze Mechanism: When the HD obtained from a certain element hash-row exceeds the threshold, an attack detection is registered. After this, if we continue the update according to (5), (6), (7), the threshold will be polluted by the attack, as the attacking traffic will be taken into account in estimating the threshold. To avoid this, we freeze the threshold and keep it constant as long as the HD is above it. Also, to prevent the attacking traffic from entering the training set, and thus keep the HD high for the whole duration of the attack, we modify the sliding window mechanism. As shown in Fig. 2, after an attack detection is registered at the $(i+1)$th time interval $d_{i+1}$, we freeze the current training set and only let the test set proceed to the next time interval. This "one freezing, one proceeding" action only ends when the HD goes below the threshold, and the normal sliding window is then resumed. Overall, the above operations are illustrated in Algorithm 1, which we term the "estimation freeze mechanism". As a side benefit of the mechanism, we can determine the attack duration $D$ because the HD is above the threshold all through the attack and immediately comes down right afterwards.

Algorithm 1: Estimation Freeze Mechanism
Input: SIP attribute stream
Output: Duration of the anomaly $D$
    $D \leftarrow 0$;
    $d \leftarrow$ time interval length;
    anomaly starting time $d_1 \leftarrow 0$;
    anomaly ending time $d_2 \leftarrow 0$;
    if HD surpasses threshold then
        $d_1 \leftarrow$ time of HD surpassing threshold;
        $d_2 \leftarrow d_1$;
        freeze training set;
        freeze threshold;
        while HD > threshold do
            proceed test set;
            calculate HD between test set and frozen training set;
            $d_2 \leftarrow d_2 + d$;
        end
        $D \leftarrow d_2 - d_1$;
    else
        proceed training set;
        proceed test set;
        update threshold;
    end
    return $D$;

We illustrate a comparison between two thresholds under attack in the same traffic condition in Fig. 3. The left one is estimated directly from HD without our estimation freeze mechanism, whereas the right one is obtained using the mechanism. We see that without freezing, the threshold rises along with the HD once the attack is detected. It even becomes much higher than the HD after the detection and cannot reflect the normal traffic condition. Such a threshold mechanism loses track of the attack after the initial detection. On the contrary, using our estimation freeze mechanism, the threshold remains low and the HD stays high after the attack is detected. Together they also explicitly determine the duration of the attack. This provides a very clear indication of the entire attack.

[Fig. 3. Comparison of thresholds under attack. Two panels (left: without the estimation freeze mechanism; right: with it), each plotting HD and Threshold; x-axis: Time (Intervals); y-axis: Hellinger Distance.]

C. Attack Detection

As described above, to actually detect possible attacks, the HD associated with a certain hash-row is computed between the sketch data distribution constructed from the testing set and that constructed from the training set. In an ideal case, assuming that the normal probability distribution could be accurately measured from the training set, the threshold for detection can be set as 0. We have the following theorem.

Theorem 1: A flooding attack over a SIP attribute can be detected with high probability by computing the HD between sketch data distributions, assuming that the normal probability distribution could be accurately measured from the training set.

Proof: Consider an element hash-row in the attribute hash-table under investigation. Suppose that the hash-row has $K$ entries. The total volume of normal traffic in the testing set is $M$, which is distributed into the $K$ entries according to $M = \sum_{i=1}^{K} m_i$, with $m_i$ denoting the volume counted by the $i$th entry. Assume that there is a flooding attack with total volume $M'$ added over the normal traffic in the testing set, which is distributed to $K'$ ($\leq K$) entries according to $M' = \sum_{i=1}^{K'} m'_i$. Let $p_i$ denote the probability mass of entry $i$, with $p_i = \frac{m_i}{M}$ in the normal situation. If the entry is contaminated by the attacking traffic, the probability mass will then be $p'_i = \frac{m_i + m'_i}{M + M'}$. Assume that the training set can accurately monitor the normal probability distribution and the testing set is consistent with such a distribution.
The performance of the HD-based detection is then determined by the relation between $p_i$ and $p'_i$ as

$$|p_i - p'_i| = \left| \frac{m_i}{M} - \frac{m_i + m'_i}{M + M'} \right| = \left| \frac{m_i M' - m'_i M}{M (M + M')} \right| = \frac{\left| \frac{m_i}{M} - \frac{m'_i}{M'} \right|}{1 + M/M'}. \tag{8}$$

Given a threshold of 0, the attacker needs to set the distribution of the flooding traffic exactly as the normal distribution to avoid being detected, since the right-hand side of (8) vanishes only when $\frac{m'_i}{M'} = \frac{m_i}{M}$ in every entry. A significant benefit of utilizing the sketch data distribution is that the hash functions used by the detection system are kept secret from users. Therefore, it is hard for the attackers to estimate the normal sketch data distribution even if they can monitor the raw user data distribution. Furthermore, the detection system can dynamically change the sketch hash functions for a higher level of security. If the attacker attempts to guess the normal sketch data distribution $\frac{m_i}{M}$, the probability of guessing the correct value will be low, because the value of $\frac{m_i}{M}$ in a given entry can be considered a continuous random variable. In other words, our detection system can detect the attack with high probability.

Theorem 1 demonstrates the ideal performance under accurate distribution modeling. Practically, since the random aggregation of sketch brings information loss and normal traffic itself is dynamic, the normal probability distribution may change over time. Thus we cannot monitor it that ideally in practice, and detection accuracy may be impacted. However, the analysis shows that attacks will indeed disturb the probability distribution obtained from the test set and as a result cause the HD to rise.

In an attribute hash-table, each element hash-row registers attacks independently when its associated HD exceeds the detection threshold. To increase detection confidence and assure high accuracy, we apply a voting procedure: if at least $z$ percent of the $H$ rows in an attribute hash-table register attacks, a flooding attack alarm is finally raised.

D. Attack Prevention

After detecting the flooding attack, the next step is to identify the offending SIP messages and discard them to prevent the attack from reaching the proxy servers. In order to achieve this, we first identify the anomalous sketch entries that contain the offending messages in each row.
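Before the entry-level analysis, the detection side described in Sections IV-B and IV-C can be sketched in Python: one object per hash-row applies the updates (5)-(7) and skips them while the HD is above the threshold, and a small function applies the $z$-percent vote across rows. This is a hedged illustration only. The class and function names, the initialization from a first normal HD value `h0` with zero deviation, and the parameter values in the usage lines are assumptions; the paper tunes $\alpha$, $\beta$, $\lambda$ and $\mu$ experimentally. Freezing the training window itself (Fig. 2) is left to the caller.

```python
class RowThreshold:
    """EWMA threshold of (5)-(7) for one element hash-row, with freezing."""

    def __init__(self, alpha, beta, lam, mu, h0):
        self.alpha, self.beta = alpha, beta
        self.lam, self.mu = lam, mu
        self.H = h0    # smoothed HD, H_n (assumed initial value)
        self.S = 0.0   # mean deviation, S_n (assumed to start at zero)
        self.threshold = lam * self.H + mu * self.S
        self.frozen = False

    def observe(self, h):
        """Process the HD of the current interval.

        Returns True when this row registers an attack. While the HD stays
        above the threshold, H, S and the threshold are left unchanged.
        """
        if h > self.threshold:
            self.frozen = True
            return True
        self.frozen = False
        self.H = (1 - self.alpha) * self.H + self.alpha * h        # (5)
        self.S = (1 - self.beta) * self.S + self.beta * abs(self.H - h)  # (6)
        self.threshold = self.lam * self.H + self.mu * self.S      # (7)
        return False


def vote(row_flags, z_percent):
    """Raise an alarm if at least z percent of the rows registered an attack."""
    return 100.0 * sum(row_flags) / len(row_flags) >= z_percent


if __name__ == "__main__":
    # Illustrative parameter values only, not taken from the paper.
    row = RowThreshold(alpha=0.125, beta=0.25, lam=1.5, mu=4.0, h0=0.05)
    for hd in [0.05, 0.06, 0.40, 0.42, 0.05]:
        print(hd, row.observe(hd), round(row.threshold, 4))
    print(vote([True, True, False, False, False], z_percent=40))
```

Counting the consecutive intervals for which `observe` returns True, multiplied by the interval length, gives the attack duration $D$ of Algorithm 1.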
Assuming that the normal probability distribution could be accurately measured from the training set, we have the following theorem.

Theorem 2: In a flooding attack context, when the HD-based detection indicates an attack, there must exist entries in a sketch hash-row for the testing set that have a larger probability mass than the corresponding entries for the training set, and such entries are definitely associated with certain offending SIP messages.

Proof: In the normal situation, we assume that the normal probability distribution could be accurately measured from the training set and the testing set is consistent with the distribution. Thus, we have $\frac{m_i}{M} = \frac{n_i}{N}$. In the context under attack, the probability mass deviation in an entry $i$ is

$$p'_i - \frac{n_i}{N} = \frac{\frac{m'_i}{M'} - \frac{n_i}{N}}{1 + M/M'} \tag{9}$$

according to (8). When the HD detection indicates an attack, there must exist entries where $p'_i \neq \frac{n_i}{N}$. Moreover, among such entries, we must have $p'_i > \frac{n_i}{N}$ for some of them and $p'_i < \frac{n_i}{N}$ for others; otherwise the condition that $\sum_{i=1}^{K} p'_i = 1$ could not be maintained. In those entries with $p'_i > \frac{n_i}{N}$, the item $\frac{m'_i}{M'}$ associated with offending messages must exist. However, the entries with $p'_i < \frac{n_i}{N}$ may not include offending messages. The reason is that the attacking traffic might only occupy a subset of the entries in a hash-row, i.e., $K' < K$. In the leftover $K - K'$ entries, $m'_i = 0$ and offending messages are not included.

According to Theorem 2, we mark entries whose probability increases as possible anomalous entries. Suppose that we have $p_i$ as the probability mass of the $i$th entry in one row from the traini
