ISSN ONLINE(2278-8875) PRINT (2320-3765)
Mr. Suresh Kashyap1, Ms. Pooja Agrawal2, Mr. Vikas Chandra Pandey3, Mr. Suraj Prasad Keshri4
Visit for more related articles at International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering
An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system. Soft computing techniques resemble biological processes more closely than traditional techniques, which are largely based on formal logical systems. Knowledge Discovery in Databases (KDD) is the automated discovery of patterns and relationships in large databases. In this paper we preprocess the KDD Cup 99 data set. We compare two algorithms: error back propagation (EBP), the most widely used training algorithm for feedforward artificial neural networks (FFANNs), and the radial basis function (RBF) neural network, which is based on supervised learning. Our results indicate that RBF performs better than EBP. MATLAB was used for the comparison.
Keywords
Detection methods, MATLAB, intrusion detection, network security.
I. INTRODUCTION
An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system. The IDS was initially designed to protect an organization's vital information from outsiders. It analyzes the information it gathers and compares it to large databases of attack signatures.
Intrusion detection functions include:
• Monitoring and analyzing both user and system activities.
• Analyzing system configurations and vulnerabilities.
• Assessing system and file integrity.
• Ability to recognize patterns typical of attacks.
• Analysis of abnormal activity patterns.
• Tracking user policy violations.
II. IDS WITH TRADITIONAL APPROACH
The increasing complexity of modern computing systems makes traditional views of information security impractical, if not impossible. Computing environments are dynamic, with near-constant changes in configurations, software, and usage patterns. Completely securing a given system is a difficult theoretical task even for static systems, and it is infeasible given the dynamic nature of today's systems. This creates the need for a more dynamic view of information security: one that recognizes the insufficiency of static descriptions of policy and security mechanisms and that proposes a dynamic means of providing security which is sufficient for a given system at a given time.
The traditional approach has several drawbacks:
• Signature-based IDSs must be programmed to detect each attack and thus must be constantly updated with signatures of new attacks.
• Many signature-based IDSs have narrowly defined signatures that prevent them from detecting variants of common attacks.
• Anomaly detection approaches usually produce a large number of false alarms due to the unpredictable nature of users and networks.
• Anomaly detection approaches often require extensive "training sets" of system event records in order to characterize normal behavior patterns.
• Application-based IDSs may be more vulnerable than host-based IDSs to being attacked and disabled, since they run as an application on the host they are monitoring.
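The second drawback can be illustrated with a minimal sketch of substring-based signature matching. The signature names and byte patterns below are hypothetical and chosen only for illustration; they are not taken from any real IDS rule set.

```python
from typing import List

# Hypothetical signature database: names and patterns are illustrative assumptions.
SIGNATURES = {
    "etc-passwd-read": b"/etc/passwd",
    "sql-union-probe": b"UNION SELECT",
}

def match_signatures(payload: bytes) -> List[str]:
    """Return the names of all signatures whose byte pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

# A payload containing the exact pattern is flagged.
print(match_signatures(b"GET /../../etc/passwd HTTP/1.0"))
# A trivial variant of the same request no longer contains the exact pattern.
print(match_signatures(b"GET /../../etc/./passwd HTTP/1.0"))
```

Because the match is exact, a small change to the request is enough to evade the narrowly defined signature, which is the weakness the list above describes.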
III. SOFT COMPUTING
Soft computing became a formal area of study in computer science in the early 1990s. Earlier computational approaches could model and precisely analyze only relatively simple systems. More complex systems arising in biology, medicine, the humanities, management sciences, and similar fields often remained intractable to conventional mathematical and analytical methods.
Components of soft computing include:
• Neural networks (NN).
• Fuzzy systems (FS).
• Evolutionary computation (EC).
• Evolutionary algorithms.
A. Why Soft Computing Tools Are Used for IDS
Traditional protection techniques such as user authentication, data encryption, avoidance of programming errors, and firewalls are used as the first line of defense for computer security. If a password is weak and is compromised, user authentication cannot prevent unauthorized use. Firewalls are vulnerable to configuration errors and susceptible to ambiguous or undefined security policies. They are generally unable to protect against malicious mobile code, insider attacks, and unsecured modems. Programming errors cannot be avoided, because the complexity of system and application software is growing rapidly, leaving behind exploitable weaknesses. Consequently, computer systems are likely to remain unsecured for the foreseeable future. Intrusion detection is useful not only in detecting successful intrusions, but also in monitoring attempts to break security, which provides important information for timely countermeasures.
An intrusion detection system (IDS) itself can be defined as the tools, methods, and resources that help identify, assess, and report unauthorized or unapproved network activity.
IV. KDD 99 DATA SET
Knowledge Discovery in Databases (KDD) is the automated discovery of patterns and relationships in large databases. Large databases are not uncommon. Cheaper and larger computer storage capabilities have contributed to the proliferation of such databases in a wide range of fields.
KDD employs methods from various fields such as machine learning, artificial intelligence, pattern recognition, database management and design, statistics, expert systems, and data visualization. KDD has been more formally defined as the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. The KDD process is a highly iterative, user-involved, multistep process, as can be seen in Figure 2.
The process starts from organizational data, that is, the operational data gathered in one or several locations.
V. CLASSIFICATION OF DATASET
The data set is classified into five broad categories based on the type of attack:
• Normal dataset: The normal class means that the IDS detects no abnormal condition.
• DoS (Denial of Service dataset): Denial of Service (DoS) is a class of attack where an attacker makes a computing or memory resource too busy or too full to handle legitimate requests, thus denying legitimate users access to a machine.
• R2L (Unauthorized Access from a Remote Machine dataset): A remote to user (R2L) attack is a class of attack where an attacker sends packets to a machine over a network, then exploits the machine's vulnerability to illegally gain local access as a user. There are different types of R2L attacks; the most common attack in this class is done using social engineering.
• U2Su (Unauthorized Access to Local Super User (root) dataset): User to root exploits are a class of attacks where an attacker starts out with access to a normal user account on the system and is able to exploit a vulnerability to gain root access to the system. The most common exploits in this class are regular buffer overflows, which are caused by regular programming mistakes and environment assumptions.
• Probing (Surveillance and Other Probing dataset): Probing is a class of attack where an attacker scans a network to gather information or find known vulnerabilities. An attacker with a map of the machines and services available on a network can use the information to look for exploits. There are different types of probes: some abuse the computer's legitimate features; some use social engineering techniques.
For training on the KDD Cup 99 data set, we assigned a number to each class, including the normal class, as shown in the table.
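A numbering of this kind might be sketched as follows. The numeric codes are an assumption made for illustration, since the paper's table is not reproduced here, and the attack-name mapping covers only a few well-known KDD Cup 99 labels rather than the full set.

```python
# Assumed numeric codes for the five classes (illustrative; not the paper's table).
CATEGORY_CODES = {"normal": 0, "dos": 1, "r2l": 2, "u2su": 3, "probe": 4}

# Partial mapping from raw KDD Cup 99 labels to the five categories.
ATTACK_TO_CATEGORY = {
    "normal": "normal",
    "smurf": "dos", "neptune": "dos", "back": "dos",
    "guess_passwd": "r2l", "ftp_write": "r2l",
    "buffer_overflow": "u2su", "rootkit": "u2su",
    "portsweep": "probe", "ipsweep": "probe", "nmap": "probe", "satan": "probe",
}

def encode_label(raw_label: str) -> int:
    """Map a raw label such as 'smurf.' to its numeric class code."""
    # Labels in the raw data files carry a trailing period, which is stripped here.
    name = raw_label.strip().rstrip(".").lower()
    return CATEGORY_CODES[ATTACK_TO_CATEGORY[name]]

print([encode_label(label) for label in ["normal.", "smurf.", "nmap."]])
```

A label missing from the partial mapping raises a `KeyError`, which makes unmapped attack names visible during preprocessing instead of silently assigning them to a class.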
VI. CLASSIFICATION
Data classification is a methodology to align business requirements to infrastructure, so that infrastructure service delivery properly supports data storage and management.
Classification of objects is probably one of the most common and ancient decision tasks performed by humans. It can be seen as the ability to assign a specific object to a predefined group or class based on a number of observed attributes of that object. The classification process was primarily related to our natural senses: humans recognize or classify objects based on the data acquired by their natural sensors. The data collected by the sensors is converted to specific features.
Figure 3 shows a schematic view of the classification process.
VII. CLASSIFICATION THROUGH EBPA
One of the most popular weight-updating rules among learning (training) algorithms is error back propagation (EBP). However, most EBP-based neural learning algorithms depend strictly on the architecture of the ANN, and there are many problems associated with the existing algorithms based on EBP and its variations. Feedforward NNs trained with EBP are highly versatile systems. They can be seen as a statistical method, a nonlinear controller, a filter, an agent behavior system, or any other complex input-output function approximator and generalizer.
Algorithm:
Training a neural net by back-propagation involves three stages:
• Feed-forward of input training pattern,
• Back-propagation of associated error, and
• Adjustment of weights.
The algorithm is as follows:
Step 1: Initialize the weights (set to random values).
Step 2: While stopping condition is false, do steps 2-9.
Step 3: For each training pair, do steps 3-8.
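The three stages can be sketched in plain Python for a network with one hidden layer and a single sigmoid output. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in the paper: the layer sizes, learning rate, fixed epoch budget as stopping condition, and the toy data are all assumed.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Initialize weights to small random values; the last weight in each row is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, x):
    """Return hidden activations and the network output for input list x."""
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y

def total_error(net, data):
    """Sum of squared errors over the training pairs."""
    return sum(0.5 * (d - forward(net, x)[1]) ** 2 for x, d in data)

def train(net, data, rate=0.5, epochs=2000):
    hidden, output = net
    for _ in range(epochs):                      # stopping condition: fixed epoch budget
        for x, d in data:                        # for each training pair
            h, y = forward(net, x)               # stage 1: feed-forward
            delta_out = (d - y) * y * (1.0 - y)  # stage 2: back-propagate the error
            delta_hid = [delta_out * output[j] * h[j] * (1.0 - h[j])
                         for j in range(len(hidden))]
            for j, v in enumerate(h + [1.0]):    # stage 3: adjust the weights
                output[j] += rate * delta_out * v
            for j, row in enumerate(hidden):
                for i, v in enumerate(x + [1.0]):
                    row[i] += rate * delta_hid[j] * v
    return net

# Toy stand-in for labelled records: two features, target 0 (normal) or 1 (attack).
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
net = init_network(n_in=2, n_hidden=2)
print("error before training:", total_error(net, DATA))
train(net, DATA)
print("error after training: ", total_error(net, DATA))
```

The hidden-layer deltas are computed before any weight is changed, so both layers are updated from the same forward pass.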
VIII. CLASSIFICATION THROUGH RADIAL BASIS FUNCTION
The radial basis function (RBF) neural network is based on supervised learning. RBF networks were independently proposed by many researchers and are a popular alternative to the multilayer perceptron (MLP). RBF networks are also good at modelling nonlinear data; they can be trained in one stage rather than through an iterative process as in the MLP, and they learn a given application quickly. They are useful in solving problems where the input data are corrupted with additive noise.
Training of RBF neural networks:
The training set consists of m labelled pairs {Xi, di} that represent associations of a given mapping or samples of a continuous multivariate function. The sum-of-squared-error criterion function can be taken as the error function E to be minimized over the given training set. The goal is a training method that minimizes E by adaptively updating the free parameters of the RBF network. These parameters are the receptive field centres μj of the hidden-layer Gaussian units, the receptive field widths σj, and the output-layer weights wij. Because the RBF network transfer characteristics are differentiable, one of the training methods considered here was a fully supervised gradient-descent method over E.
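In particular, μj, σj and wij are updated as follows:
$$\Delta \mu_j = -\rho_\mu \nabla_{\mu_j} E, \qquad \Delta \sigma_j = -\rho_\sigma \frac{\partial E}{\partial \sigma_j}, \qquad \Delta w_{ij} = -\rho_w \frac{\partial E}{\partial w_{ij}}$$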
where ρμ, ρσ and ρw are small positive constants. This method is capable of matching or exceeding the performance of neural networks trained with the back-propagation algorithm, but gives training times comparable with those of sigmoidal-type FFNNs [14]. The training of the RBF network is radically different from the classical training of standard FFNNs. In this case, the weights are not changed by a gradient method aimed at function minimization. In RBF networks with the chosen type of radial basis function, training reduces to selecting the centres and dimensions of the functions and calculating the weights of the output neuron. We then simulated the IDS data in MATLAB using EBPA and the radial basis network (RBN) and obtained the following results.
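The fully supervised gradient-descent variant described above can be sketched for a single-output network with Gaussian hidden units. This is an illustrative sketch under stated assumptions: the Gaussian form exp(-||x - μj||² / (2σj²)), the per-pattern update order, the learning-rate values, and the toy data are all assumed and are not taken from the paper.

```python
import math

def rbf_outputs(x, centres, widths):
    """Gaussian hidden-unit activations, assuming phi_j = exp(-||x - mu_j||^2 / (2 sigma_j^2))."""
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, mu)) / (2.0 * s * s))
            for mu, s in zip(centres, widths)]

def predict(x, centres, widths, weights):
    """Network output: weighted sum of the hidden-unit activations."""
    return sum(w * p for w, p in zip(weights, rbf_outputs(x, centres, widths)))

def sum_squared_error(data, centres, widths, weights):
    return 0.5 * sum((d - predict(x, centres, widths, weights)) ** 2 for x, d in data)

def train_epoch(data, centres, widths, weights, rho_mu=0.01, rho_sigma=0.005, rho_w=0.1):
    """One pass of per-pattern gradient descent on E; updates the three lists in place."""
    for x, d in data:
        phi = rbf_outputs(x, centres, widths)
        err = d - sum(w * p for w, p in zip(weights, phi))
        for j in range(len(weights)):
            mu, s, w = centres[j], widths[j], weights[j]
            dist2 = sum((a - m) ** 2 for a, m in zip(x, mu))
            common = err * w * phi[j]  # factor shared by the centre and width gradients
            centres[j] = [m + rho_mu * common * (a - m) / (s * s) for a, m in zip(x, mu)]
            widths[j] = s + rho_sigma * common * dist2 / (s ** 3)
            weights[j] = w + rho_w * err * phi[j]

# Toy data (assumed): two features per record, target 0 or 1.
DATA_RBF = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
centres = [[0.1, 0.9], [0.9, 0.1]]
widths = [0.5, 0.5]
weights = [0.0, 0.0]
print("E before:", sum_squared_error(DATA_RBF, centres, widths, weights))
for _ in range(100):
    train_epoch(DATA_RBF, centres, widths, weights)
print("E after: ", sum_squared_error(DATA_RBF, centres, widths, weights))
```

Each update is the negative gradient of E with respect to the corresponding parameter, scaled by its own small constant, and all three use the parameter values from before the update.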
IX. COMPARISON
We trained our neural network using both the EBP algorithm and RBF and obtained different outputs. As the two figures above show, RBF works well on the IDS training data, while EBP gives less efficient results.
We trained our neural network on a very small amount of data, which may be why the results contain errors. To reduce the error, the following steps can be taken:
1) Initializing the connection weights of the neural network with better values.
2) Setting other parameters, such as a bias neuron.
3) Using a larger amount of IDS training data.
X. CONCLUSION AND FURTHER RESEARCH
This paper describes the training of neural networks designed specifically for IDS data. Much work has been done in this field, and a number of soft computing based tools have been designed for intrusion detection. This paper is an effort towards the simulation of IDS data for developing an intelligent system for intrusion detection. Our results show that this approach to developing an IDS can be enhanced by using the techniques discussed in the comparison section.
Further research in this field can consider different algorithms for training neural networks with a larger amount of data, and compare the results to conclude which algorithm is most suitable for IDS data. In addition, a new soft computing tool can be designed for IDS using hybrid soft computing technology.
References
|