Entry for the 2008 Humie award for Human Competitive Results

Title: Immune-inspired Network Intrusion Detection System (i-NIDS)


(1) the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result,

M. Zubair Shafiq, Syed Ali Khayam, Muddassar Farooq
Improving the Accuracy of Immune-inspired Malware Detectors by using Intelligent Features

M. Zubair Shafiq, Muddassar Farooq, Syed Ali Khayam
A Comparative Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro Fuzzy Inference Systems for Portscan Detection


(2) the name, complete physical mailing address, e-mail address, and phone
number of EACH author of EACH paper,

Name: M. Zubair Shafiq
Physical address:
Next Generation Intelligent Networks Research Center
National University of Computer & Emerging Sciences
A.K. Brohi Road, Sector H-11/4, Islamabad, Pakistan
Email: zubair.shafiq@nexginrc.org
Tel: +92 51 111 128 128 (Ext. 190)

Name: Syed Ali Khayam
Physical address:
AB-2, Room 19
School of Electrical Engineering and Computer Science
Formerly NUST Institute of Information Technology
National University of Sciences & Technology
Chaklala Scheme III, Rawalpindi, Pakistan
Email: khayam@niit.edu.pk
Tel: +92 51 395 912

Name: Muddassar Farooq 
Physical address: 
Next Generation Intelligent Networks Research Center 
National University of Computer & Emerging Sciences 
A.K. Brohi Road, Sector H-11/4, Islamabad, Pakistan 
Email: muddassar.farooq@nexginrc.org 
Tel: +92 51 111 128 128 (Ext. 206)


(3) the name of the corresponding author (i.e., the author to whom notices will
be sent concerning the competition),

M. Zubair Shafiq (zubair.shafiq@nexginrc.org)


(4) the abstract of the paper(s),

Title:
A Comparative Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro
Fuzzy Inference Systems for Portscan Detection
Abstract:
Worms spread by scanning for vulnerable hosts across the Internet.
In this paper we report a comparative study of three classification
schemes for automated portscan detection. These schemes include a simple
Fuzzy Inference System (FIS) that uses classical inductive learning, a
Neural Network that uses back propagation algorithm and an Adaptive
Neuro Fuzzy Inference System (ANFIS) that also employs back propagation
algorithm. We carry out an unbiased evaluation of these schemes
using an endpoint based traffic dataset. Our results show that ANFIS
(though more complex) successfully combines the benefits of the classical
FIS and Neural Network to achieve the best classification accuracy.

Title:
Improving Accuracy of Immune-inspired Malware Detectors by using Intelligent Features
Abstract:
In this paper, we show that a Bio-inspired classifier's accuracy can be
dramatically improved if it operates on intelligent features. We propose
a novel set of intelligent features for the well-known problem of malware portscan
detection. We compare the performance of three well-known Bio-inspired classifiers
operating on the proposed intelligent features: (1) Real Valued Negative Selection
(RVNS) based on the adaptive immune system; (2) Dendritic Cell Algorithm (DCA)
based on the innate immune system; and (3) Adaptive Neuro Fuzzy Inference System
(ANFIS). To empirically evaluate the improvements provided by the intelligent
features, we use a network traffic dataset collected on diverse endpoints for a
period of 12 months. The endpoints' traffic is infected with well-known malware.
For unbiased performance comparison, we also include a machine learning algorithm,
Support Vector Machine (SVM), and two state-of-the-art statistical malware detectors,
Rate-Limiting (RL) and Maximum-Entropy (ME). To the best of our knowledge, this is
the first study in which RVNS and DCA are not only compared with each other but also
with several other classifiers on a comprehensive real-world dataset. The experimental
results indicate that our proposed features significantly improve the TP rate and FP
rate of both RVNS and DCA.


(5) a list containing one or more of the eight letters (A, B, C, D, E, F, G, or
H) that correspond to the criteria (see above) that the author claims that the
work satisfies,

A, D, E, F, G


(6) a statement stating why the result satisfies the criteria that the
contestant claims (see the examples below as a guide to aid in constructing this
part of the submission),

Network/computer security has emerged as a rapidly evolving and
subtle field over the past decade. A recent study by CERT [1]
estimated that the damage caused by network/computer malware (or
worms) soared to more than 100 billion dollars in 2007 alone. This
significant increase in losses is attributable to: (1) the
increasing degree to which the Internet has become a part of the
global business infrastructure, and (2) the increase in
sophistication and speed of zero-day (previously unseen) malware,
which can propagate to almost every host connected to the Internet
within minutes. Such a propagation speed rules out any manual human
countermeasures, and consequently emphasizes the need for automated
intrusion detection/prevention systems. The field of
network/computer security has seen the design and development of
several automated anomaly-based statistical malware detectors [6,7],
which analyze statistics of users' network traffic. However,
state-of-the-art malware detectors do not have the desired accuracy,
which leaves network administrators with only reactive
countermeasures and postmortem analysis of traffic logs.

The Bio-inspired community has also undertaken a significant amount
of research to provide Bio-inspired solutions to this daunting
problem. Specifically, the field of Artificial Immune Systems (AISs)
has emerged, which takes inspiration from the working principles of
the biological immune system. The biological immune system does not
require a priori knowledge about harmful bacteria and viruses to
protect an organism from them. Researchers in the field of
artificial immune systems have been trying to replicate immunology
to develop an automated yet intelligent intrusion detection system
that can detect zero-day attacks. Several intrusion detection
systems have been developed, such as LISYS (lightweight intrusion
detection system), based on the principles of the adaptive immune
system, and DCA (dendritic cell algorithm), based on the principles
of the innate immune system [4,5].

The Bio-inspired intrusion detection systems developed to date have
never been evaluated on real-world traffic sets (obtained from the
real Internet). Moreover, they have never been compared with
state-of-the-art statistical malware detectors, and as a result, the
networking community still questions their merit on the real
Internet. To address this gap, we spent more than 12 months of
dedicated effort collecting network traffic on a diverse set of
endpoints including home, office and university computers. This
network traffic dataset was collected at WAVES lab, Michigan State
University [2]. Our evaluations have shown that the classification
accuracy of state-of-the-art Bio-inspired malware detectors is
significantly worse than that of state-of-the-art statistical
anomaly detectors. But even the relatively better accuracy of
statistical anomaly detectors is still too low for them to be of
much value for deployment on the real-world Internet.

Our analysis showed that the poor accuracy of Bio-inspired malware
detectors on our real-world dataset is attributable to the use of
naive traffic features. To solve this problem, we proposed four
novel intelligent features: (1) burstiness of session arrivals,
which captures burstiness in users' traffic behavior, (2)
multi-resolution session rates, which quantify the effect of traffic
volume at multiple time resolutions, (3) entropy of destination IP
addresses, which captures the spreading behavior of users' traffic,
and (4) divergence in port distributions, which captures the
difference in port histograms on a port-by-port basis. Our results
show that these features significantly improved the classification
accuracy of Bio-inspired malware detectors, and consequently they
outperformed state-of-the-art statistical detectors by a significant
margin in terms of classification accuracy.
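
As a rough illustration of what three of these features measure, the
following minimal Python sketch computes them over one window of
session records. It is not the i-NIDS implementation: the function
names, the bin widths, the use of peak (rather than mean) session
rate, the choice of Kullback-Leibler divergence, and the eps
smoothing constant are all assumptions made for this example.
Burstiness of session arrivals is omitted because its exact
definition is not given here.

```python
import math
from collections import Counter


def session_rates(timestamps, resolutions=(1.0, 10.0, 60.0)):
    """Peak sessions per second at each bin width (hypothetical choice).

    timestamps: session start times in seconds.
    Returns {bin_width: peak_rate}.
    """
    rates = {}
    for width in resolutions:
        # Count how many sessions start inside each bin of this width.
        bins = Counter(int(t // width) for t in timestamps)
        rates[width] = max(bins.values()) / width if bins else 0.0
    return rates


def destination_entropy(dst_ips):
    """Shannon entropy (bits) of destination IPs seen in one window.

    Low entropy: traffic goes to few hosts. High entropy: traffic
    is spread over many hosts, as in a scan.
    """
    counts = Counter(dst_ips)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())


def port_divergence(baseline_ports, window_ports, eps=1e-9):
    """Per-port terms of KL(window || baseline), in bits.

    eps is an assumed smoothing constant that avoids log(0) for
    ports missing from one of the two histograms.
    """
    base, cur = Counter(baseline_ports), Counter(window_ports)
    n_base, n_cur = sum(base.values()), sum(cur.values())
    if n_base == 0 or n_cur == 0:
        return {}
    ports = set(base) | set(cur)
    k = len(ports)
    terms = {}
    for port in ports:
        p = (cur[port] + eps) / (n_cur + eps * k)
        q = (base[port] + eps) / (n_base + eps * k)
        terms[port] = p * math.log2(p / q)
    return terms
```

Summing the values returned by port_divergence gives a single
divergence score for the window, while the individual terms show
which ports contribute most to the deviation from the baseline.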

We followed an engineering vision while designing our intelligent
features framework, which helped us realize our proposed anomaly
detection framework on the real Internet. We carefully constrained
the computational complexity and memory requirements of all proposed
features. Consequently, we have successfully realized our proposed
system (i-NIDS) in the kernel of MS Windows. We are also planning to
release our engineered system (i-NIDS) for MS Windows under a
suitable public license at our website [3]. The expected release
will be either at GECCO 2008 or right after GECCO. We also intend to
provide a detailed demonstration showing the merits and strengths of
our i-NIDS at GECCO.


[1] Carnegie Mellon University's Computer Emergency Response Team
(CERT), http://www.cert.org/

[2] Wireless and Video Communications Lab, Michigan State
University, http://www.egr.msu.edu/waves/

[3] Next Generation Intelligent Networks Research Center, National
University of Computer & Emerging Sciences, Islamabad, Pakistan,
http://www.nexginrc.org/

[4] U. Aickelin and J. Greensmith, Sensing Danger: Innate
Immunology for Intrusion Detection, Elsevier Information Security,
2007.

[5] F. Gonzalez and D. Dasgupta, Anomaly Detection Using Real-Valued
Negative Selection, Journal of Genetic Programming and Evolvable
Machines, Volume 4, Issue 4, pp. 383-403, December, 2003.

[6] Y. Gu, A. McCallum, and D. Towsley, Detecting anomalies in
network traffic using maximum entropy estimation, USENIX/ACM
Internet Measurement Conference, Berkeley, U.S.A, 2005.

[7] J. Twycross and M. M. Williamson, Implementing and testing a
virus throttle, Usenix Security Symposium, Washington DC, U.S.A,
2003.


(7) a full citation of the paper (that is, author names; publication date; name
of journal, conference, technical report, thesis, book, or book chapter; name of
editors, if applicable, of the journal or edited book; publisher name; publisher
city; page numbers, if applicable);

M. Zubair Shafiq, Syed Ali Khayam, Muddassar Farooq, "Improving the
Accuracy of Immune-inspired Malware Detectors by using Intelligent
Features", Genetic and Evolutionary Computation Conference (GECCO),
July, 2008, Atlanta, USA. (In Press)

M. Zubair Shafiq, Muddassar Farooq, Syed Ali Khayam, "A Comparative
Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro
Fuzzy Inference Systems for Portscan Detection", In M. Giacobini et
al. (Eds.), Proceedings of Applications of Evolutionary Computing,
EvoWorkshops 2008 (EvoCoMnet), Volume 4974 of Lecture Notes in
Computer Science, pp. 48–57, Springer Verlag, Napoli, Italy, March,
2008. (BEST PAPER NOMINATION)

(8) a statement either that "any prize money, if any, is to be divided equally
among the co-authors" OR a specific percentage breakdown as to how the prize
money, if any, is to be divided among the co-authors.

Any prize money, if any, will be divided equally among the co-authors.

(9) a statement stating why the judges should consider the entry as "best" in
comparison to other entries that may also be "human-competitive."

The importance of an automated intrusion detection system can
hardly be overemphasized given the possibility of massive losses in
revenue. State-of-the-art intrusion detection systems (both
statistical and Bio-inspired) require significant performance
improvement before they can be deployed on the real Internet. Our
proposed i-NIDS significantly outperforms state-of-the-art
statistical and Bio-inspired malware detectors in terms of
classification accuracy. We followed an engineering vision during
our research phase, which helped us engineer i-NIDS in the kernel of
MS Windows. The proposed system is undergoing extensive testing at
our research center and is expected to be released under a suitable
public license in the summer of 2008. We believe that our system
will automatically provide robust protection of the Internet against
zero-day attacks with minimal or no user intervention. As a result,
it will make it extremely difficult for hackers to launch stealthy
zero-day attacks on the Internet that cause billions of dollars of
loss in revenue. Our research project has already attracted
US$250,000 in funding from the National ICT R&D Fund of the Ministry
of IT, Government of Pakistan
(http://www.ictrdf.org.pk/fp-aisgpids.htm).