Entry for the 2008 Humie award for Human Competitive Results

Title: Immune-inspired Network Intrusion Detection System (i-NIDS)

(1) the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result,

M. Zubair Shafiq, Syed Ali Khayam, Muddassar Farooq
Improving the Accuracy of Immune-inspired Malware Detectors by using Intelligent Features

M. Zubair Shafiq, Muddassar Farooq, Syed Ali Khayam
A Comparative Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro Fuzzy Inference Systems for Portscan Detection

(2) the name, complete physical mailing address, e-mail address, and phone number of EACH author of EACH paper,

Name: M. Zubair Shafiq
Physical address:
Next Generation Intelligent Networks Research Center
National University of Computer & Emerging Sciences
A.K. Brohi Road, Sector H-11/4, Islamabad, Pakistan
Email: zubair.shafiq@nexginrc.org
Tel: +92 51 111 128 128 (Ext. 190)

Name: Syed Ali Khayam
Physical address:
AB-2, Room 19
School of Electrical Engineering and Computer Science
Formerly NUST Institute of Information Technology
National University of Sciences & Technology
Chaklala Scheme III, Rawalpindi, Pakistan
Email: khayam@niit.edu.pk
Tel: +92 51 395 912

Name: Muddassar Farooq
Physical address:
Next Generation Intelligent Networks Research Center
National University of Computer & Emerging Sciences
A.K. Brohi Road, Sector H-11/4, Islamabad, Pakistan
Email: muddassar.farooq@nexginrc.org
Tel: +92 51 111 128 128 (Ext. 206)

(3) the name of the corresponding author (i.e., the author to whom notices will be sent concerning the competition),

M. Zubair Shafiq (zubair.shafiq@nexginrc.org)

(4) the abstract of the paper(s),

Title: A Comparative Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro Fuzzy Inference Systems for Portscan Detection
Abstract: Worms spread by scanning for vulnerable hosts across the Internet.
In this paper we report a comparative study of three classification schemes for automated portscan detection. These schemes include a simple Fuzzy Inference System (FIS) that uses classical inductive learning, a Neural Network that uses back propagation algorithm and an Adaptive Neuro Fuzzy Inference System (ANFIS) that also employs back propagation algorithm. We carry out an unbiased evaluation of these schemes using an endpoint based traffic dataset. Our results show that ANFIS (though more complex) successfully combines the benefits of the classical FIS and Neural Network to achieve the best classification accuracy.

Title: Improving the Accuracy of Immune-inspired Malware Detectors by using Intelligent Features
Abstract: In this paper, we show that a Bio-inspired classifier's accuracy can be dramatically improved if it operates on intelligent features. We propose a novel set of intelligent features for the well-known problem of malware portscan detection. We compare the performance of three well-known Bio-inspired classifiers operating on the proposed intelligent features: (1) Real Valued Negative Selection (RVNS) based on the adaptive immune system; (2) Dendritic Cell Algorithm (DCA) based on the innate immune system; and (3) Adaptive Neuro Fuzzy Inference System (ANFIS). To empirically evaluate the improvements provided by the intelligent features, we use a network traffic dataset collected on diverse endpoints for a period of 12 months. The endpoints' traffic is infected with well-known malware. For unbiased performance comparison, we also include a machine learning algorithm, Support Vector Machine (SVM), and two state-of-the-art statistical malware detectors, Rate-Limiting (RL) and Maximum-Entropy (ME). To the best of our knowledge, this is the first study in which RVNS and DCA are not only compared with each other but also with several other classifiers on a comprehensive real-world dataset.
The experimental results indicate that our proposed features significantly improve the TP rate and FP rate of both RVNS and DCA.

(5) a list containing one or more of the eight letters (A, B, C, D, E, F, G, or H) that correspond to the criteria (see above) that the author claims that the work satisfies,

A, D, E, F, G

(6) a statement stating why the result satisfies the criteria that the contestant claims (see the examples below as a guide to aid in constructing this part of the submission),

Network/computer security has emerged as a rapidly evolving and subtle field in the past decade. A recent study by CERT [1] has shown that the estimated damage due to attacks by network/computer malware (or worms) soared to more than 100 billion dollars in 2007 alone. This significant increase in losses is attributable to: (1) the increasing degree to which the Internet has become part of the global business infrastructure, and (2) the increase in sophistication and speed of zero-day (previously unseen) malware, which can propagate to almost every host connected to the Internet within minutes. Such propagation speed rules out any human countermeasures against computer malware (worms), and consequently emphasizes the need for automated intrusion detection/prevention systems.

The field of network/computer security has seen the design and development of several automated anomaly-based statistical malware detectors [6,7], which analyze statistics of users' network traffic. However, state-of-the-art malware detectors do not have the desired accuracy, which leaves network administrators with only the options of reactive countermeasures and postmortem analysis of traffic logs. The Bio-inspired community has also undertaken a significant amount of research to provide Bio-inspired solutions to this daunting problem.
Specifically, the field of Artificial Immune Systems (AISs) has emerged, which takes inspiration from the working principles of the biological immune system. The biological immune system does not require a priori knowledge about harmful bacteria and viruses to protect an organism from them. Researchers in the field of artificial immune systems have been trying to replicate immunology to develop an automated yet intelligent intrusion detection system that can detect zero-day attacks. Several intrusion detection systems have been developed, such as LISYS (light weight intrusion detection system), based on the principles of the adaptive immune system, and DCA (dendritic cell algorithm), based on the principles of the innate immune system [4,5].

The Bio-inspired intrusion detection systems developed to date have never been evaluated on real-world traffic datasets (obtained from the real Internet). Moreover, they have never been compared with state-of-the-art statistical malware detectors, and as a result, the networking community still questions their merit on the real Internet. To address this gap, we spent more than 12 months of dedicated effort collecting network traffic on a diverse set of endpoints including home, office and university computers. This network traffic dataset was collected at WAVES lab, Michigan State University [2]. Our evaluations have shown that the classification accuracy of state-of-the-art Bio-inspired malware detectors is significantly poorer than that of state-of-the-art statistical anomaly detectors. But even the relatively better accuracy of statistical anomaly detectors is still so low that they are of little value for deployment on the real-world Internet. Our analysis showed that the poor accuracy of Bio-inspired malware detectors on our real-world dataset is attributable to the use of naive traffic features.
To solve this problem, we proposed four novel intelligent features: (1) burstiness of session arrivals, which captures burstiness in users' traffic behavior, (2) multi-resolution session rates, which quantify the effect of traffic volume at multiple time resolutions, (3) entropy of destination IP addresses, which captures the spreading behavior of users' traffic, and (4) divergence in port distributions, which captures the difference in port histograms on a port-by-port basis. Our results show that these features significantly improved the classification accuracy of Bio-inspired malware detectors, and consequently they outperformed state-of-the-art statistical detectors by a significant margin in terms of classification accuracy.

We followed an engineering vision while designing our intelligent features framework, which helped us realize our proposed anomaly detection framework on the real Internet. We have carefully considered the computational complexity and memory requirements of all proposed features. Consequently, we have successfully realized our proposed system (i-NIDS) in the kernel of MS Windows. We are also planning to release our engineered system (i-NIDS) for MS Windows under a suitable public license at our website [3]. The release is expected either at GECCO 2008 or right after it. We also intend to provide a detailed demonstration showing the merits and strengths of our i-NIDS at GECCO.

[1] Carnegie Mellon University's Computer Emergency Response Team (CERT), http://www.cert.org/
[2] Wireless and Video Communications Lab, Michigan State University, http://www.egr.msu.edu/waves/
[3] Next Generation Intelligent Networks Research Center, National University of Computer & Emerging Sciences, Islamabad, Pakistan, http://www.nexginrc.org/
[4] U. Aickelin and J. Greensmith, Sensing Danger: Innate Immunology for Intrusion Detection, Elsevier Information Security, 2007.
[5] F. Gonzalez and D.
Dasgupta, Anomaly Detection Using Real-Valued Negative Selection, Genetic Programming and Evolvable Machines, Volume 4, Issue 4, pp. 383-403, December 2003.
[6] Y. Gu, A. McCallum, and D. Towsley, Detecting anomalies in network traffic using maximum entropy estimation, USENIX/ACM Internet Measurement Conference, Berkeley, U.S.A, 2005.
[7] J. Twycross and M. M. Williamson, Implementing and testing a virus throttle, Usenix Security Symposium, Washington DC, U.S.A, 2003.

(7) a full citation of the paper (that is, author names; publication date; name of journal, conference, technical report, thesis, book, or book chapter; name of editors, if applicable, of the journal or edited book; publisher name; publisher city; page numbers, if applicable);

M. Zubair Shafiq, Syed Ali Khayam, Muddassar Farooq, "Improving the Accuracy of Immune-inspired Malware Detectors by using Intelligent Features", Genetic and Evolutionary Computation Conference (GECCO), July, 2008, Atlanta, USA. (In Press)

M. Zubair Shafiq, Muddassar Farooq, "A Comparative Study of Fuzzy Inference Systems, Neural Networks and Adaptive Neuro Fuzzy Inference Systems for Portscan Detection", In M. Giacobini et al. (Eds.), Proceedings of Applications of Evolutionary Computing, EvoWorkshops 2008 (EvoCoMnet), Volume 4974 of Lecture Notes in Computer Science, pp. 48–57, Springer Verlag, Napoli, Italy, March, 2008. (BEST PAPER NOMINATION)

(8) a statement either that "any prize money, if any, is to be divided equally among the co-authors" OR a specific percentage breakdown as to how the prize money, if any, is to be divided among the co-authors.

Any prize money, if any, will be divided equally among the co-authors.

(9) a statement stating why the judges should consider the entry as "best" in comparison to other entries that may also be "human-competitive."

The importance of an automated intrusion detection system can hardly be overemphasized, given the possibility of massive losses in revenue.
State-of-the-art intrusion detection systems (both statistical and Bio-inspired) require significant performance improvement before they can be deployed on the real Internet. Our proposed i-NIDS significantly outperforms state-of-the-art statistical and Bio-inspired malware detectors in terms of classification accuracy. We followed an engineering vision during our research phase, which helped us implement i-NIDS in the kernel of MS Windows. The proposed system is undergoing extensive testing at our research center and is expected to be released under a suitable public license in the summer of 2008. We believe that our system will automatically provide robust protection of the Internet against zero-day attacks with minimal or no user intervention. As a result, it will make it extremely difficult for hackers to launch stealthy zero-day attacks on the Internet that cause billions of dollars of lost revenue. Our research project has already attracted US$250,000 in funding from the National ICT R&D Fund of the Ministry of IT, Government of Pakistan (http://www.ictrdf.org.pk/fp-aisgpids.htm).
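
Illustrative sketch of two of the proposed features. To make features (3) and (4) from the statement under item (6) more concrete, the following is a minimal Python sketch and not the i-NIDS implementation. It assumes Shannon entropy over the destination IP addresses seen in one time window, and a smoothed Kullback-Leibler divergence between a baseline port histogram and the current window's histogram, reported port by port. The exact definitions used in the papers may differ; the function names, the smoothing constant and the windowing are hypothetical.

```python
import math
from collections import Counter


def destination_entropy(dst_ips):
    """Shannon entropy (in bits) of the destination IPs seen in one window.

    Assumption: a scanning host contacts many distinct addresses, so the
    entropy of its destination distribution rises.
    """
    counts = Counter(dst_ips)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def port_divergence(baseline_ports, window_ports, smoothing=1e-6):
    """Smoothed Kullback-Leibler divergence of the window's port histogram
    from a baseline histogram.

    Returns (total, per_port), where per_port maps each port to its own
    contribution, giving a port-by-port view of the difference.
    The additive smoothing (a hypothetical choice) avoids division by zero
    for ports that appear in only one of the two histograms.
    """
    base = Counter(baseline_ports)
    cur = Counter(window_ports)
    ports = sorted(set(base) | set(cur))
    base_total = sum(base.values()) + smoothing * len(ports)
    cur_total = sum(cur.values()) + smoothing * len(ports)
    per_port = {}
    for port in ports:
        p = (cur[port] + smoothing) / cur_total
        q = (base[port] + smoothing) / base_total
        per_port[port] = p * math.log2(p / q)
    return sum(per_port.values()), per_port


if __name__ == "__main__":
    # Made-up example traffic: a benign window versus a scan-like window.
    benign = ["10.0.0.5"] * 8 + ["10.0.0.9"] * 2
    scan = ["10.0.1.%d" % i for i in range(10)]
    print("benign entropy:", destination_entropy(benign))
    print("scan entropy:  ", destination_entropy(scan))
    total, _ = port_divergence([80] * 10, list(range(1, 11)))
    print("port divergence of scan-like window:", total)
```

In this sketch both values are computed per time window and would be handed to a classifier as features; the window length and the way the baseline histogram is learned are left open because the entry does not specify them.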