ISSN Online (2319-8753), Print (2347-6710)
A. Aafreen, Kannan Balasubramanian, M.Tech, Ph.D.
International Journal of Innovative Research in Science, Engineering and Technology
Indirect attacks have been a serious threat to server security because of their covert nature. The web proxy based Distributed Denial of Service (DDoS) attack is an increasingly common Internet phenomenon and is capable of making Internet services unavailable. This type of attack cannot be easily discovered by most existing defense systems, since the malicious traffic is hidden in the aggregated traffic. In addition, the sources of the attack traffic and the normal traffic cannot be distinguished, because both share the IP address of the proxy server. To overcome this problem, an improved hidden semi-Markov model (HsMM) is proposed; applying it protects the origin server from web proxy based HTTP attacks. A web proxy's access behavior can be regarded as the combination of its externally observable behavior and an internal driving mechanism. The internal driving mechanism can be estimated from the observable features of the proxy-to-server traffic through the HsMM, which describes the dynamic behavior process of the aggregated traffic. The false positive rate is also evaluated with respect to the incoming traffic.
Keywords
Traffic analysis, traffic modelling, distributed denial of service attack, attack detection, attack response.
I. INTRODUCTION
A Distributed Denial-of-Service (DDoS) attack is an attempt to make a computer resource unavailable to its intended users. One common method of attack involves saturating the target machine with external communication requests, so that it cannot respond to legitimate traffic or responds so slowly that it is rendered essentially unavailable.
There is another, more covert type of DDoS attack: the web proxy based distributed denial of service (WPDDoS) attack. The key point of a WPDDoS attack is that an attacker may exploit the communication mechanism of proxy servers to attack the victim via the ready-made hierarchical web proxy network [1][2][3]. This means that any publicly accessible Internet proxy server may be passively involved in WPDDoS attack events and may unknowingly act as an attacker. In a real environment, an attacker can easily turn a proxy server into an attack tool by forcing the proxy to forward its attack HTTP requests to the victim server.
Thus a single attacker can simultaneously trigger many proxy servers to attack a web server without needing to invade them. The threat of WPDDoS mainly comes from the following.
1. These attacks are based on the HTTP protocol, so the attack traffic can pass through most current border firewalls. Moreover, most existing detection systems, which are designed for TCP- or IP-layer DDoS attacks, are unable to effectively discover the attack signals raised by WPDDoS attacks, which work at the application layer.
2. As shown in Fig. 1, the attack traffic and the normal traffic are both forwarded to the victim server by the hierarchical proxy network. Thus the victim server can judge which proxy's outgoing traffic includes attack behaviour, but it cannot accurately block the attack traffic by its source IP [4][5]. This increases the false positive rate (FPR) of most existing detection systems.
A Robust Defense Scheme for the Detection of Distributed Denial of Service Attack through Web Proxy System |
3. The attack traffic and the normal traffic continue to be aggregated by each web proxy that they pass through. Hence it is not easy for the detection system to discover the real attack pattern. Combined with botnets, WPDDoS may be even more aggressive.
These issues not only make DDoS attacks easier to launch but also increase their effectiveness. The proposed scheme detects such attacks via behavior analysis. It assumes that the time-varying aggregated traffic sent by a particular web proxy and observed by the victim server is controlled by a series of underlying behavioral patterns of the web proxy [11][12]. A transition between two consecutive behavior patterns indicates that the driving mechanism of the aggregated traffic is changing. A mathematical model, namely the HMM, is then applied to describe the behavior characteristics of the web proxies.
II. PROPOSED SYSTEM
A Hidden Markov Model (HMM) is used to detect attack traffic in distributed denial of service attacks. To demonstrate the effectiveness of the proposed method, the work is confined to the following conditions.
1. It deals with the flooding of HTTP requests from the attacker to the web proxy.
2. The web proxy forwards all the requests to the web server. The web server accepts or rejects each request based on its recessive attribute, namely normal or abnormal behavior.
The system design of the proposed technique is depicted in Fig. 2. The diagram shows how the proxy-to-server traffic is generated and how the web server decides whether to accept or reject an incoming request.
The diagram illustrates the following steps.
Step 1: A general mathematical model is developed to describe the traffic behavior of the proxy servers.
Step 2: The incoming aggregated traffic is evaluated against the given behavior model.
Step 3: A judgment index is computed. It is the judgment criterion obtained from the HMM, based on the probability obtained from the accepted web pages.
Step 4: The web server returns its response to the web proxy based on the detection result.
B. Behavior Model
A proxy's access behavior can be regarded as the combination of external manifestations and an intrinsic driving mechanism. The external manifestations include temporal and spatial locality. Here, temporal locality is used to build the behavior model.
To obtain statistics on the most recently accessed pages, a least recently used (LRU) stack is used. It converts the reference stream into a stack distance stream: for each request, it outputs the distance of the requested item from the top of the stack.
In the proposed method, spoofed requests are identified among a number of genuine requests. This is done by generating a number of requests from the user to the web server. If any two consecutive requests occur within a time interval of 10 ns, the string concerned is considered abnormal and is not inserted into the stack. Similarly, if the same string occurs more than once within a small time interval, that string is also considered abnormal. The probabilities of accepted and rejected strings are recorded, and the false positive rate is also calculated.
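The two timing rules above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the authors' implementation: the class name `RequestFilter`, the use of integer nanosecond timestamps, and the 1 ms value chosen for the unspecified "small time interval" are all hypothetical. Only the 10 ns threshold comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RequestFilter:
    """Flags requests as abnormal using the two timing rules described above."""
    min_gap_ns: int = 10                # 10 ns threshold quoted in the text
    repeat_window_ns: int = 1_000_000   # hypothetical value for the "small time interval"
    last_arrival_ns: Optional[int] = None
    last_seen_ns: Dict[str, int] = field(default_factory=dict)
    accepted: int = 0
    rejected: int = 0

    def check(self, url: str, t_ns: int) -> bool:
        """Return True if the request is treated as normal, False if abnormal."""
        # Rule 1: two consecutive requests arrive too close together.
        too_fast = (self.last_arrival_ns is not None
                    and t_ns - self.last_arrival_ns < self.min_gap_ns)
        # Rule 2: the same string is repeated within the repeat window.
        previous = self.last_seen_ns.get(url)
        repeated = previous is not None and t_ns - previous < self.repeat_window_ns
        # Assumption: timestamps are recorded for every request, accepted or not.
        self.last_arrival_ns = t_ns
        self.last_seen_ns[url] = t_ns
        normal = not (too_fast or repeated)
        if normal:
            self.accepted += 1
        else:
            self.rejected += 1
        return normal

    def acceptance_probability(self) -> float:
        """Fraction of requests seen so far that were accepted."""
        total = self.accepted + self.rejected
        return self.accepted / total if total else 0.0
```

Only requests for which `check` returns `True` would be pushed onto the LRU stack; `acceptance_probability` gives the empirical acceptance probability that the text says is recorded.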
C. Stack Distance Model for Temporal Locality
Temporal locality refers to the property that referencing behavior in the recent past is a good predictor of the referencing behavior to be seen in the near future. A stack of object references is a good model for characterizing the behavior of proxies and caches, so the stack distance model is used here to describe the web proxies' access behavior.
A request string can be converted into a distance string that preserves the pattern of activity. To define this notion, the files are assumed to be placed on a stack such that, whenever a file f is requested, it is either pulled from its position in the stack and placed on top, or simply added to the stack if it is not already in it.
Thus, starting with an empty stack, the reference stream is $F_i = \{f_1, f_2, \ldots, f_i\}$, where $f_i$ denotes the name of the $i$th requested file. The index $i$ indicates that $i$ requests have already arrived at the server. Thus the unit of time is one request, and each incoming request represents a new event. The least recently used (LRU) stack is denoted by $L_i$, which is an ordering of all files of the server by recency of use. At index $i$, the LRU stack is $L_i = \{u_1, u_2, \ldots, u_N\}$, where $u_1, u_2, \ldots, u_N$ are the files of the server, $u_1$ is the most recently accessed file, $u_2$ the next most recently accessed, and so on. In other words, $u_1$ has just been accessed at index $i$, i.e., $f_i = u_1$. Whenever a reference is made to a file, the stack must be updated. If $f_{i+1} = u_j$, the stack becomes $L_{i+1} = \{u_j, u_1, \ldots, u_{j-1}, u_{j+1}, \ldots, u_N\}$. Suppose now that $L_{i-1} = \{u_1, u_2, \ldots, u_N\}$ and $f_i = u_j$, i.e., the request $f_i$ is at distance $j$ in stack $L_{i-1}$.
Let $d_i$ denote the stack depth of the document referenced at index $i$. Then, if $f_i = u_j$, we have $d_i = j$, where $j$ denotes the stack depth of the requested document at index $i$. Thus the reference trace $\{f_1, f_2, \ldots, f_i\}$ defines a numerical sequence $\{d_1, d_2, \ldots, d_i\}$ of stack distances. Fig. 3 gives a possible state of the stack before and after a request for document "A" occurs. The stack distance for this access is 3. Repeating this transformation for each access, the object reference stream is translated into the stack distance stream. For example, the reference symbol stream {A, D, C, A, B, D, E, A, B} is translated into the numerical stack distance stream {3, 4, 3, 3, 5, 4, 5, 4, 4}. Popular files tend to experience much shorter stack distances than rare files.
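The transformation from a reference stream to a stack distance stream can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function name `stack_distances` is hypothetical. The initial stack order C, E, A, D, B (top first) is an assumption inferred so that the output agrees with the example stream quoted above, since Fig. 3 is not reproduced here. The depth assigned to a file that is not yet on the stack (current stack size plus one) is also an assumption, because the text leaves that case undefined.

```python
def stack_distances(references, initial_stack=()):
    """Translate a reference stream into LRU stack distances (1 = top of stack)."""
    stack = list(initial_stack)  # index 0 is the most recently used file
    distances = []
    for name in references:
        if name in stack:
            depth = stack.index(name) + 1   # depth j of the file before the update
            stack.remove(name)
        else:
            # Assumption: a file not yet on the stack gets depth = stack size + 1.
            depth = len(stack) + 1
        stack.insert(0, name)               # the referenced file moves to the top
        distances.append(depth)
    return distances


# Assumed initial stack (top first), chosen to match the example in the text.
print(stack_distances("ADCABDEAB", initial_stack=["C", "E", "A", "D", "B"]))
# prints [3, 4, 3, 3, 5, 4, 5, 4, 4]
```

With this assumed starting state, the first request for "A" finds it at depth 3, in line with the stack distance stated for Fig. 3.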
D. Hidden Semi-Markov Model
The basic structure of the HsMM is illustrated in Fig. 4. An HsMM consists of a pair of discrete-time stochastic processes $O_t$ and $X_t$, $t \in \{1, 2, \ldots, T\}$, where $t$ is the index of the observation (also called the event). $O_t$ is the observed (or output) process and may be either discrete or continuous, univariate or multivariate, while $X_t$ is the finite-state hidden semi-Markov chain.
Here $\{X_t\}$ is not directly observable, but it can be estimated through $\{O_t\}$. To model a web proxy's access behavior by an HsMM, each hidden semi-Markov state represents a driving mechanism of the proxy-to-server traffic. A transition between two different states represents a change of the driving mechanism. The duration of a particular semi-Markov state represents the dwell time of its corresponding driving mechanism. Two driving mechanisms are defined here, namely normal and abnormal behavior.
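To make the roles of state, dwell time, and observation concrete, the following Python sketch generates a synthetic sequence from a two-state HsMM. It is purely illustrative: the dwell-time choices and emission probabilities are hypothetical numbers, the observations are assumed to be stack distances, and the function name `sample_hsmm` is not from the paper. With only two states and explicit durations, leaving a state always means entering the other one.

```python
import random

# All numbers below are hypothetical and only illustrate the structure.
DWELL_TIMES = {"normal": [3, 4, 5], "abnormal": [1, 2]}   # possible state durations
EMISSIONS = {                                              # (values, probabilities)
    "normal": ([1, 2, 3], [0.6, 0.3, 0.1]),
    "abnormal": ([3, 4, 5], [0.2, 0.3, 0.5]),
}


def sample_hsmm(length, seed=0):
    """Sample hidden states X_t and observations O_t for t = 1..length."""
    rng = random.Random(seed)
    state = "normal"                      # assumed initial driving mechanism
    states, observations = [], []
    while len(observations) < length:
        dwell = rng.choice(DWELL_TIMES[state])          # how long the state lasts
        values, weights = EMISSIONS[state]
        for _ in range(min(dwell, length - len(observations))):
            states.append(state)
            observations.append(rng.choices(values, weights=weights)[0])
        # Two states and explicit durations: the chain switches to the other state.
        state = "abnormal" if state == "normal" else "normal"
    return states, observations


states, observations = sample_hsmm(12, seed=1)
for x, o in zip(states, observations):
    print(x, o)
```

In detection, the direction is reversed: only the observations are available, and the hidden state sequence has to be estimated from them.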
F. Web Server Model
The stack distance reveals how frequently a particular string is accessed from the stack. Here the strings represent URLs. The probability of strings being accepted or rejected is obtained with the stack concept and is given as input to the HsMM. Through the modeling of the HsMM, the final probability of acceptance or rejection of a string is obtained. From this, the proxy-to-server traffic can be analyzed to reveal the recessive attribute of a particular string, i.e., whether it is normal or abnormal. The false positive rate is also obtained by comparing the designed HsMM with the arrival rate based method.
III. IMPLEMENTATION
The implementation methodology for detecting distributed denial of service attacks through a web proxy system is discussed here.
A. Behavior Module
The user submits a request for the required web page. The request is transferred from the client to the web server through the web proxy system. The proxy in turn originates a new HTTP request and sends it to the web server in such a way that the requester is not known. The web server then decides whether to serve the web page based on a certain strategy [6][7][8].
B. Hidden Semi Markov Module
This module is implemented with the tool jahmm, an implementation of Hidden Markov Models in Java. The implementation demonstrates how to build an HMM with known parameters, how to generate a sequence of observations with the given HMM, how to learn the parameters of an HMM from observation sequences, and how to compute the probability of an observation sequence for an HMM [9][10]. It models a computer network that can experience jamming. When the wireless medium is jammed, a lot of web pages are lost. Thus, the HMMs built here have two states (congested/not congested).
The output probability of whether a string is accepted or rejected is obtained from the behavior model for a number of different requests. This probability is given as the initial value for the HMM. A transition between the two states of the HMM indicates whether the string is likely to be accepted or rejected. The string here represents the URL of the requester. The HMM returns its value after processing the observation sequence. From this, the detection performance is compared with that of the arrival rate based scheme.
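As an illustration of computing the probability of an observation sequence for a two-state HMM, the following plain-Python sketch implements the standard forward algorithm. It does not use or reproduce the jahmm API. All numeric parameters are hypothetical, as are the encoding of observations (0 = page delivered, 1 = page lost) and the choice of feeding an acceptance probability of 0.6 in as the initial state distribution.

```python
def sequence_probability(observations, start, transition, emission):
    """Forward algorithm: probability of an observation sequence under an HMM."""
    if not observations:
        return 1.0
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[r] * transition[r][s] for r in range(n_states)) * emission[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model: state 0 = not congested, state 1 = congested.
START = [0.6, 0.4]                      # e.g. accept / reject probabilities (assumed)
TRANSITION = [[0.8, 0.2], [0.3, 0.7]]   # TRANSITION[r][s] = P(next = s | current = r)
EMISSION = [[0.9, 0.1], [0.2, 0.8]]     # EMISSION[s][o] = P(observation o | state s)

print(sequence_probability([0, 0, 1, 0], START, TRANSITION, EMISSION))
```

A low probability for the observed sequence under a model trained on normal traffic would suggest abnormal behavior; where to place the threshold is a design decision that the text does not specify.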
C. Web Server Module
Modeling the varying process of a web proxy's recessive attributes can profile the proxy's real behavior better than the dominant attributes can. The dominant attributes can be directly observed; they include arrival rate, temporal locality, packet size and so on. Recessive attributes, e.g., the type of traffic (normal or abnormal), cannot be directly observed from the proxy-to-server traffic. The main challenge of such a model is that the recessive attributes are unobservable to the victim server.
Here a web proxy is regarded as an invisible state machine whose state sequence represents the varying process of the proxy's recessive attributes. Since all recessive attributes are unobservable to the victim server, the state sequence can only be estimated via the observable dominant attributes of the proxy-to-server traffic. Hence, once the probability value has been obtained from the HMM, the web server decides whether to accept or reject the string based on normal or abnormal behavior. The following algorithm is used to evaluate whether the traffic is normal or abnormal.
Algorithm 1: |
D. Performance Analysis
The acceptance rate of incoming requests in the proposed scheme is compared with that of the commonly used arrival rate based method.
In this paper, "normal" denotes proxy-to-server traffic without malicious behavior, and "polluted" denotes mixed traffic of normal and attack requests. In Fig. 5, the curve denotes the normal proxy-to-server traffic with the accepted incoming files. The acceptance rate of incoming requests in the proposed scheme is found to be better than that of detection based on the arrival rate.
The values on the x-axis denote the number of incoming requests with respect to the memory. The values on the y-axis denote the probability of acceptance.
In Fig. 6, the false positive rate (FPR) of the arrival rate method is compared with that of the HMM based technique with respect to the arrival rate of incoming requests.
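For reference, the two quantities compared in these figures can be computed as in the Python sketch below. It uses the standard definition FPR = FP / (FP + TN), evaluated over the normal requests only. The function names and the toy labels are hypothetical and are not results from the paper.

```python
def false_positive_rate(is_attack, flagged):
    """FPR = FP / (FP + TN): share of normal requests wrongly flagged as abnormal."""
    fp = sum(1 for a, f in zip(is_attack, flagged) if not a and f)
    tn = sum(1 for a, f in zip(is_attack, flagged) if not a and not f)
    return fp / (fp + tn) if fp + tn else 0.0


def acceptance_rate(flagged):
    """Share of incoming requests that were accepted (not flagged)."""
    return sum(1 for f in flagged if not f) / len(flagged) if flagged else 0.0


# Toy data (hypothetical): ground truth and a detector's decisions.
is_attack = [False, False, False, False, True, True]
flagged = [False, True, False, False, True, False]
print(false_positive_rate(is_attack, flagged))  # prints 0.25
print(acceptance_rate(flagged))
```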
The proposed scheme is based on proxy behavior instead of traffic volume. It does not depend on traffic intensity but only compares the proxy's current behavior with the normal behavior profile. Although traffic volume is not suitable for detection when attacks use low traffic, the presence of attack requests will still distort a proxy's access behavior, which enables the proposed scheme to detect them. Detection becomes easier when the proxy's access behavior is seriously distorted by heavy attack traffic.
The proposed scheme is compared with detection based on arrival rate, and the performance of the HMM based scheme is found to be better than that of the existing system.
IV. CONCLUSIONS
The proposed method focuses on a new latent attack that exploits the communication mechanism of proxy servers to carry out web proxy based distributed denial of service attacks. Temporal locality is used to extract the access behavior of the web proxy. The recent usage of the files is obtained from the depth of the stack. The normality of a string is determined from the time difference between the currently requested file and the previously requested file.
An improved hidden semi-Markov model is proposed. It demonstrates how to build the HMM and the observation sequence with known parameters. The output probability obtained from the behavior model is given as input to the initial states of the HMM. A transition between the two states of the HMM indicates whether an incoming request is likely to be accepted or rejected. Hence the web proxy's access behavior is modelled by mapping it to the HMM. The recessive attributes, which are not directly observed, are obtained from the observed proxy-to-server traffic. The driving mechanism, i.e., whether the traffic is normal or abnormal, is also obtained from the incoming requests.
The acceptance rate of incoming requests in the proposed scheme is compared with that of the commonly used arrival rate based method, and the performance of the HMM based technique is found to be better than that of detection based on the arrival rate.
The false positive rate with respect to the number of incoming requests is identified, and the performance of the HMM based technique is again found to be better than that of detection based on the arrival rate.
As a result, the issues of false positive rate and acceptance rate are addressed.
In future, the proposed method will be implemented on a real communication platform to test other new strategies. We also propose to investigate other web proxy based distributed denial of service attacks.
References |
|