

Data Mining for Network Intrusion Detection
Vipin Kumar
Army High Performance Computing Research Center, Department of Computer Science, University of Minnesota
http://www.cs.umn.edu/~kumar
Project participants: V. Kumar, A. Lazarevic, J. Srivastava, P. Dokas, E. Eilertson, L. Ertoz, S. Iyer, S. Ketkar, P. Tan
Research supported by AHPCRC/ARL

Cyber Threat Analysis
- As the cost of information processing and Internet accessibility falls, organizations are becoming increasingly vulnerable to potential cyber threats such as network intrusions
- [Figure: incidents reported to the Computer Emergency Response Team/Coordination Center (CERT/CC) per year, 1990 to 2001; vertical axis from 0 to 60,000]
- Intrusions are actions that attempt to bypass the security mechanisms of computer systems
- Intrusions are caused by:
  - Attackers accessing the system from the Internet
  - Insider attackers: authorized users attempting to gain and misuse non-authorized privileges

Intrusion Detection
- Intrusion Detection System (IDS)
  - A combination of software and hardware that attempts to perform intrusion detection
  - Raises an alarm when a possible intrusion happens
- Traditional IDS tools (e.g. SNORT, www.snort.org) are based on signatures of known attacks
- Limitations
  - The signature database has to be manually revised for each new type of discovered intrusion
  - They cannot detect emerging cyber threats
  - Substantial latency in deployment of newly created signatures across the computer system

Data Mining for Intrusion Detection
- Increased interest in data mining based IDS for detection of
  - Attacks for which it is difficult to build signatures
  - Unforeseen/unknown attacks
  - Emerging threats
- Data mining approaches for intrusion detection
  - Misuse detection
    - Builds predictive models from labeled data sets (instances are labeled as “normal” or “intrusive”)
    - Can only detect known attacks and their variations
    - High accuracy in detecting many kinds of known attacks
  - Anomaly detection
    - Able to detect novel attacks as deviations from “normal” behavior
    - Potentially high false alarm rate: previously unseen (yet legitimate) system behaviors may also be recognized as anomalies

Misuse Detection
- Classification of intrusions
  - RIPPER [Madam ID @ Columbia U], Bayesian classifier [ADAM @ George Mason U], fuzzy association rules [Bridges00], decision trees [ARL U Texas, Sinclair99], neural networks [Lippmann00, Ghosh99, Canady98], genetic algorithms [Bridges00, Sinclair99]
- Association pattern analysis
  - Building normal profile [Barbara01, Manganaris99], frequent episodes for constructing features [Madam ID @ Columbia U]
- Cost sensitive modeling
  - AdaCost [Fan99], MetaCost [Domingos99], [Ting00], [Karakoulas95]
- Learning from rare class
  - [Kubat97, Fawcett97, Ling98, Provost01, Japkowicz01, Chawla01, Joshi01]

Anomaly Detection
- Statistical approaches
  - Finite mixture model [Yamanishi00], chi-square (χ²) based [Ye01]
- Various anomaly detection approaches
  - Temporal sequence learning [Lane98], neural networks [Ryan98], similarity tree [Kokkinaki97], generating artificial anomalies [Fan01], clustering [Madam ID, Eskin02], unsupervised SVM [Madam ID, Eskin02]
- Outlier detection schemes
  - Nearest neighbor approaches [Knorr98, Jin01, Ramaswamy00, Aggarwal01], density based [Breunig00], connectivity based [Tang01], clustering based [Yu99]

Key Technical Challenges
- Large data size: millions of network connections are common for commercial network sites
- High dimensionality: hundreds of dimensions are possible
- Temporal nature of the data: data points close in time are highly correlated
- Skewed class distribution: interesting events are very rare (looking for the “needle in a haystack”)
  - “Mining needle in a haystack. So much hay and so little time”
- Data preprocessing: converting network traffic into data
- High Performance Computing (HPC) can be critical for on-line analysis and scalability to very large data sets

MINDS Project - Recent Accomplishments
- MINDS: MINnesota INtrusion Detection System
  - Learning from rare class: building rare class prediction models
  - Anomaly/outlier detection
  - Summarization of attacks using association pattern analysis

MINDS - Learning from Rare Class
- Problem: building models for rare network attacks (mining needle in a haystack)
  - Standard data mining models are not suitable for rare classes
  - Models must be able to handle skewed class distributions
  - Learning from data streams: intrusions are sequences of events
- Key results:
  - PNrule and related work [Joshi, Agarwal, Kumar, SIAM 2001, SIGMOD 2001, ICDM 2001, KDD 2002]
  - SMOTEBoost algorithm [Lazarevic, in review]
  - CREDOS algorithm [Joshi, Kumar, in review]
  - Classification based on association: add frequent items as “meta-features” to the original data set

MINDS - Anomaly and Outlier Detection Approach
- Detecting novel attacks/intrusions by identifying them as deviations from “normal” behavior
- Goals:
  - Construct a useful set of features for data mining algorithms
  - Identify novel intrusions using outlier detection schemes
    - Distance based techniques
      - Nearest neighbor approach
      - Mahalanobis-distance approach
    - Clustering based approaches
    - Density based schemes
    - Unsupervised Support Vector Machines (SVM)

Experimental Evaluation
- Publicly available data set
  - DARPA 1998 Intrusion Detection Evaluation Data Set
- Real network data from the University of Minnesota
  - SNORT (www.snort.org): open source signature-based network IDS
- [Diagram: network, net-flow data collected using CISCO routers, MINDS data preprocessing, anomaly detection, anomaly scores, association pattern analysis. Labels recoverable from the figure: "10 minutes cycle", "2 millions connections", "Anomaly detection is applied 4 times a day", "10 minutes time window"]

DARPA 1998 Data Set
- The DARPA 1998 data set (prepared and managed by MIT Lincoln Lab) includes a wide variety of intrusions simulated in a military network environment
- 9 weeks of raw TCP dump data
  - 7 weeks for training (5 million connection records)
  - 2 weeks for testing (2 million connection records)
- Connections are labeled as normal or attacks (4 main categories of attacks, 38 attack types)
  - DOS: Denial Of Service
  - Probe: e.g. port scanning
  - U2R: unauthorized access to gain root privileges
  - R2L: unauthorized remote login to machine
- Two types of attacks
  - Bursty attacks: involve multiple network connections
  - Non-bursty attacks: involve single network connections

Feature construction
- Three groups of features
  - Basic features of individual TCP connections
    - Source & destination IP/port, protocol, number of bytes, duration, number of packets (used in SNORT only in stream builder)
  - Time based features
    - For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in the last T seconds
    - Number of connections from the source (destination) IP to the same destination (source) port in the last T seconds
  - Connection based features
    - For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in the last N connections
    - Number of connections from the source (destination) IP to the same destination (source) port in the last N connections
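The two time based features for the source side can be sketched as follows. This is a minimal illustration, not the MINDS implementation: the function name `time_window_features`, the tuple layout of a connection record, and the choice to count the current connection itself are all assumptions, and the "inside the network" restriction is left out for brevity.

```python
from collections import deque


def time_window_features(connections, window_seconds):
    """Compute two time based features per connection (hypothetical sketch).

    connections: iterable of (timestamp, src_ip, dst_ip, dst_port),
                 sorted by timestamp.
    Returns one (unique_dst_ips, same_port_count) pair per connection, both
    counted over the same source IP in the last `window_seconds` seconds,
    including the current connection.
    """
    recent = deque()  # connections still inside the time window
    features = []
    for ts, src, dst, port in connections:
        # Drop connections that fell out of the window.
        while recent and ts - recent[0][0] > window_seconds:
            recent.popleft()
        recent.append((ts, src, dst, port))

        same_src = [c for c in recent if c[1] == src]
        unique_dst_ips = len({c[2] for c in same_src})
        same_port_count = sum(1 for c in same_src if c[3] == port)
        features.append((unique_dst_ips, same_port_count))
    return features
```

The connection based variants could follow the same pattern by keeping the last N connections (for example with `deque(maxlen=N)`) instead of pruning by timestamp.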

MINDS Outlier Detection on DARPA’98 Data
- [Figure: ROC curves (detection rate vs. false alarm rate) for different outlier detection techniques: LOF approach, NN approach, Mahalanobis approach, unsupervised SVM. One panel shows ROC curves for bursty attacks, the other ROC curves for single-connection attacks]
- LOF approach is consistently better than other approaches
- Unsupervised SVMs are good but only for high false alarm (FA) rates
- LOF approach is superior to other outlier detection schemes
- NN approach is comparable to LOF for low FA rates, but its detection rate decreases for high FA rates
- Mahalanobis-distance approach performs poorly due to multimodal normal behavior
- The majority of single-connection attacks are probably located close to the dense regions of the normal data
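For reference, one ROC point per threshold can be computed from anomaly scores and ground-truth labels roughly as below. This is a generic sketch rather than the evaluation code used for the slides; the function name `roc_points` and the "score >= threshold means flagged" convention are assumptions.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    scores: anomaly score per connection (higher means more anomalous).
    labels: True for attack connections, False for normal ones.
    Assumes at least one attack and one normal connection are present.
    """
    n_attack = sum(labels)
    n_normal = len(labels) - n_attack
    points = []
    # Sweep the threshold from the highest score downward.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [lab for s, lab in zip(scores, labels) if s >= threshold]
        detected = sum(flagged)                # attacks that were flagged
        false_alarms = len(flagged) - detected  # normal traffic that was flagged
        points.append((false_alarms / n_normal, detected / n_attack))
    return points
```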

Outlier Detection: Recent Results (on DARPA’98 data)
- Analyzing multi-connection attacks using the score values assigned to network connections
- Detection rate is measured through the number of connections that have a score higher than 0.5
- [Figure: connection score (0 to 1) vs. number of connections (0 to 100) for the LOF approach, the NN approach and the Mahalanobis-distance based approach]
- Low peaks are due to occasional “reset” values for the feature called “connection status”

Recently Detected Real-life Attacks
- During the past few months, various intrusive/suspicious activities were detected at the AHPCRC and at the U of Minnesota using MINDS
- A sample of top ranked anomalies/attacks picked by MINDS
  - August 13, 2002: detected scanning for the Microsoft DS service on port 445/TCP (ranked #1)
    - Reported by CERT as recent DoS attacks that need further analysis (CERT, August 9, 2002)
    - Undetected by SNORT since the scanning was non-sequential (very slow)
- [Figure: number of scanning activities on the Microsoft DS service on port 445/TCP reported in the world (source: www.incidents.org)]

Recently Detected Real-life Attacks (ctd)
- A sample of top ranked anomalies/attacks picked by MINDS
  - August 13, 2002: detected scanning for Oracle server (ranked #2)
    - Reported by CERT, June 13, 2002
    - First detection of this attack type by our University
    - Undetected by SNORT because the scanning was hidden within another Web scanning
  - August 8, 2002: identified a machine that was running a Microsoft PPTP VPN server on non-standard ports, which is a policy violation (ranked #1)
    - Undetected by SNORT since the collected GRE traffic was part of the normal traffic
  - October 30, 2002: identified compromised machines that were running FTP servers on non-standard ports, which is a policy violation (ranked #1)
    - Anomaly detection identified this due to a huge file transfer on a non-standard port
    - Undetectable by SNORT because there are no signatures for these activities

Recently Detected Real-life Attacks (ctd)
- A sample of top ranked anomalies/attacks picked by MINDS
  - October 10, 2002: detected several instances of the slapper worm that were not identified by SNORT since they were variations of existing worm code
    - Detected by the MINDS anomaly detection algorithm since the source and destination ports are the same but non-standard, and because of slow scan-like behavior for the source port
    - Potentially detectable by SNORT using more general rules, but the false alarm rate will be too high
- [Figure: number of slapper worms on port 2002 reported in the world (source: www.incidents.org)]

Recently Detected Real-life Attacks (ctd)
- Top ranked anomalies/attacks picked by MINDS
  - October 10, 200: detected a distributed Windows networking scan from two different source locations (ranked #1)
    - A similar distributed scan from 100 machines scattered around the world happened at the University of Auckland, New Zealand, on August 8, 2002, and it was reported by CERT, Insecure.org and other security organizations
- [Figure: distributed scanning activity, showing attack sources and destination IPs]

SNORT vs. MINDS Anomaly/Outlier
- SNORT has static knowledge that is manually updated by human analysts
- MINDS anomaly/outlier detection algorithms are adaptive in nature (in effect, they include an infinite number of rules)
- MINDS anomaly/outlier detection algorithms can also be effective in detecting anomalous behavior originating from a compromised machine

SNORT vs. MINDS Anomaly/Outlier
- Content-based attacks (e.g. content of the packet)
  - SNORT is able to detect only those attacks with known signatures
  - Out of scope for MINDS anomaly/outlier detection algorithms, since they do not use the content of the packets
- Scanning activities
  - Same-source sequential destination scans
    - SNORT is better than MINDS anomaly/outlier detection in identifying these attacks, since it is specifically designed for their detection
  - Scans with random destinations
    - MINDS anomaly/outlier detection algorithms discover them quicker than SNORT, since SNORT has to increase the time window (which specifies the scanning threshold), which results in large memory requirements
  - Slow scans
    - MINDS anomaly/outlier detection identifies them better than SNORT, since SNORT has to increase the time window, which increases processing requirements
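The time window trade-off for slow scans can be illustrated with a toy threshold detector. This is a hypothetical sketch and not SNORT's actual scan detection logic: the function name `windowed_scan_detector` and its parameters are assumptions. It flags a source once it contacts more than `max_targets` distinct destinations within `window_seconds`, so a scan that is slow enough stays under the threshold unless the window (and with it the retained state) grows.

```python
from collections import defaultdict, deque


def windowed_scan_detector(events, window_seconds, max_targets):
    """Flag sources that touch too many distinct destinations in a window.

    events: iterable of (timestamp, src_ip, dst_ip), sorted by timestamp.
    Returns the set of flagged source IPs.
    """
    recent = defaultdict(deque)  # per-source history kept inside the window
    flagged = set()
    for ts, src, dst in events:
        history = recent[src]
        history.append((ts, dst))
        # Forget contacts older than the window; the current event always stays.
        while ts - history[0][0] > window_seconds:
            history.popleft()
        if len({d for _, d in history}) > max_targets:
            flagged.add(src)
    return flagged


# A fast scan (one probe per second) and a slow scan (one probe per minute).
events = sorted(
    [(t, "fast", f"10.0.0.{t}") for t in range(10)]
    + [(60 * t, "slow", f"10.0.1.{t}") for t in range(10)]
)
short_window = windowed_scan_detector(events, window_seconds=30, max_targets=5)
long_window = windowed_scan_detector(events, window_seconds=600, max_targets=5)
```

With the short window only the fast scanner is flagged; catching the slow scanner requires the longer window, which means keeping more history per source.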

SNORT vs. MINDS Anomaly/Outlier
- Policy violations (e.g. rogue and unauthorized services)
  - MINDS anomaly/outlier detection algorithms are successful in detecting policy violations, since they are looking for unusual and suspicious network behavior
  - To detect these attacks, SNORT has to have a rule for each specific unauthorized activity, which increases the number of rules and therefore the memory requirements

MINDS - Framework for Mining Associations
- [Diagram: the Anomaly Detection System produces ranked connections labeled attack or normal; these feed the Discriminating Association Pattern Generator, which updates the Knowledge Base]
- Steps shown in the diagram:
  1. Build normal profile
  2. Study changes in normal behavior
  3. Create attack summary
  4. Detect misuse behavior
  5. Understand nature of the attack
- Example rules:
  - R1: TCP, DstPort 1863: Attack
  - R100: TCP, DstPort 80: Normal

Discovered Real-life Association Patterns
- Rule 1: SrcIP XXXX, DstPort 80, Protocol TCP, Flag SYN, NoPackets: 3, NoBytes: 120 180 (c1 256, c2 1)
- Rule 2: SrcIP XXXX, DstIP YYYY, DstPort 80, Protocol TCP, Flag SYN, NoPackets: 3, NoBytes: 120 180 (c1 177, c2 0)
- At first glance, Rule 1 appears to describe a Web scan
- Rule 2 indicates an attack on a specific machine
- Both rules together indicate that a scan is performed first, followed by an attack on a specific machine identified as vulnerable by the attacker
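The slides do not define c1 and c2. Assuming they are the numbers of connections matching a pattern among the anomalous and the normal connections respectively, such counts could be computed as in the sketch below. The function name `pattern_counts` and the dictionary representation of a connection are hypothetical and are not taken from MINDS.

```python
def pattern_counts(pattern, anomalous, normal):
    """Count how many connections in each class match a pattern.

    pattern:   dict mapping attribute name to required value,
               e.g. {"DstPort": 80, "Flag": "SYN"}.
    anomalous: list of connection dicts ranked as anomalous (assumed c1 side).
    normal:    list of connection dicts ranked as normal (assumed c2 side).
    Returns (c1, c2).
    """
    def matches(conn):
        return all(conn.get(attr) == value for attr, value in pattern.items())

    return sum(map(matches, anomalous)), sum(map(matches, normal))
```

Under this reading, a pattern with a large c1 and a c2 near zero discriminates well between the two classes, which is what makes it a candidate attack summary.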

Discovered Real-life Association Patterns (ctd)
- DstIP ZZZZ, DstPort 8888, Protocol TCP (c1 369, c2 0)
- DstIP ZZZZ, DstPort 8888, Protocol TCP, Flag SYN (c1 291, c2 0)
- This pattern indicates an anomalously high number of TCP connections on port 8888 involving machine ZZZZ
- Follow-up analysis of connections covered by the pattern indicates that this could be a machine running a variation of the Kazaa file-sharing protocol
- Having an unauthorized application increases the vulnerability of the system

Discovered Real-life Association Patterns (ctd)
- SrcIP XXXX, DstPort 27374, Protocol TCP, Flag SYN, NoPackets 4, NoBytes 189 200 (c1 582, c2 2)
- SrcIP XXXX, DstPort 12345, NoPackets 4, NoBytes 189 200 (c1 580, c2 3)
- SrcIP YYYY, DstPort 27374, Protocol TCP, Flag SYN, NoPackets 3, NoBytes 144 (c1 694, c2 3)
- This pattern indicates a large number of scans on ports 27374 (which is a signature for the SubSeven worm) and 12345 (which is a signature for the NetBus worm)
- Further analysis showed that no fewer than five machines were scanning for one or both of these ports in any time window

Discovered Real-life Association Patterns (ctd)
- DstPort 6667, Protocol TCP (c1 254, c2 1)
- This pattern indicates an unusually large number of connections on port 6667 detected by the anomaly detector
- Port 6667 is where IRC (Internet Relay Chat) is typically run
- Further analysis reveals that there are many small packets from/to various IRC servers around the world
- Although IRC traffic is not unusual, the fact that it is flagged as anomalous is interesting
  - This might indicate that the IRC server has been taken down (by a DOS attack, for example) or that it is a rogue IRC server (it could be involved in some hacking activity)

Discovered Real-life Association Patterns (ctd)
- DstPort 1863, Protocol TCP, Flag 0, NoPackets 1, NoBytes 139 (c1 498, c2 6)
- DstPort 1863, Protocol TCP, Flag 0 (c1 587, c2 6)
- DstPort 1863, Protocol TCP (c1 606, c2 8)
- This pattern indicates a large number of anomalous TCP connections on port 1863
- Further analysis reveals that the remote IP block is owned by Hotmail
- Flag 0 is unusual for TCP traffic

Conclusions
- Rare class predictive models improve the detection of infrequent attack types
- MINDS anomaly/outlier detection algorithms are successful in detecting intrusions that could not be picked up by commercial “state of the art” IDS tools (SNORT)
  - Slow scans and random scans
  - Policy violations and unauthorized activities
- MINDS association patterns can be useful in creating summaries of detected attacks and suggesting new signatures

Future Work
- On-line detection algorithms
- Better characterization of “normal” behavior
- Detection of distributed attacks
- Insider attacks
- Other applications of anomaly detection
  - Credit card fraud detection
  - Insurance fraud detection
  - Transient fault detection for industrial process control
  - Detecting individuals with rare medical syndromes (e.g. cardiac arrhythmia)

Questions?

Distance based Outlier Detection Schemes
- Nearest Neighbor (NN) approach
  - For each point, compute the distance d_k to its k-th nearest neighbor
  - Outliers are points that have a larger distance d_k and are therefore located in sparser neighborhoods
- Mahalanobis-distance based approach
  - Mahalanobis distance is more appropriate for computing distances with skewed distributions
- [Figure: scatter plot of points with rotated axes x’ and y’ and two marked points p1 and p2]
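Both distance based scores can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the MINDS code: the function names are hypothetical, Euclidean distance is assumed for the NN score, and the Mahalanobis version is restricted to two dimensions so that the covariance matrix can be inverted by hand.

```python
import math


def kth_nn_distance(points, k):
    """Distance from each point to its k-th nearest neighbor (Euclidean)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def mahalanobis_2d(points, x):
    """Mahalanobis distance of the 2-D point x from the mean of `points`.

    Uses the sample covariance of `points`; assumes it is non-singular.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mx, x[1] - my
    # (dx, dy) times the inverse covariance times (dx, dy), written out for 2x2.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)
```

For data stretched along a diagonal, a point lying along the stretch gets a much smaller Mahalanobis distance than a point the same Euclidean distance away but across it, which is the effect the rotated axes x’ and y’ in the figure are meant to convey.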

Density based Outlier Detection Schemes
- Local Outlier Factor (LOF) approach
  - For each point, compute the density of its local neighborhood
  - Compute the LOF of example p as the average of the ratios of the density of example p and the density of its nearest neighbors
  - Outliers are points with the largest LOF value
- In the NN approach, p2 is not considered an outlier, while the LOF approach finds both p1 and p2 as outliers
- [Figure: data set with two marked points p1 and p2]
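A compact sketch of LOF, following the usual definition from [Breunig00] (k-distance, reachability distance, local reachability density), is given below. It is not the MINDS implementation: the function name `lof_scores` is hypothetical, ties at the k-th neighbor are ignored, and the points are assumed to be distinct so that no density is infinite. In this formulation the ratio is the density of each neighbor divided by the density of p, so larger values mean p is sparser than its neighborhood.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor of every point, using exactly k neighbors each."""
    n = len(points)
    neighbors = []  # indices of the k nearest neighbors of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append([j for _, j in ranked[:k]])
        k_dist.append(ranked[k - 1][0])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to neighbor j.
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        mean_reach = sum(reach_dist(i, j) for j in neighbors[i]) / k
        lrd.append(1.0 / mean_reach)

    # LOF: average ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]
```

Points inside a uniformly dense cluster score close to 1, while a point that is far from a dense cluster scores well above 1.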

Unsupervised Support Vector Machines for Outlier Detection
- Unsupervised SVMs attempt to separate the entire set of training data from the origin, i.e. to find a small region where most of the data lies and label data points in this region as one class
- Parameters
  - Expected number of outliers
  - Variance of the RBF kernel: as the variance of the RBF kernel gets smaller, the separating surface gets more complex
- [Figure: the hyperplane is pushed away from the origin as much as possible]
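For orientation, a commonly used formulation of this idea is the one-class SVM of Schölkopf et al., written out below. Whether MINDS used exactly this formulation is an assumption; the symbols are introduced here for illustration only. In this notation, ν plays the role of the expected fraction of outliers and σ² is the variance of the RBF kernel.

```latex
% One-class SVM primal problem for training points x_1, ..., x_n
% with feature map \Phi (assumed formulation, not taken from the slides).
\begin{aligned}
\min_{w,\ \xi,\ \rho}\quad & \tfrac{1}{2}\lVert w \rVert^{2}
    + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
\text{subject to}\quad & \langle w, \Phi(x_i) \rangle \ge \rho - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
\end{aligned}

% RBF kernel with variance \sigma^2
k(x, y) = \langle \Phi(x), \Phi(y) \rangle
        = \exp\!\left(-\frac{\lVert x - y \rVert^{2}}{2\sigma^{2}}\right)

% Decision function: +1 inside the region holding most of the data, -1 for outliers
f(x) = \operatorname{sgn}\bigl(\langle w, \Phi(x) \rangle - \rho\bigr)
```

Maximizing ρ corresponds to pushing the hyperplane away from the origin, and the slack variables ξ_i allow a fraction of the points (bounded above by ν) to fall on the wrong side.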

SNORT: Signature-based Network IDS
- SNORT (www.snort.org) is an open source Network Intrusion Detection System (IDS) based on signatures
- SNORT contains the anomaly detector SPADE (Statistical Packet Anomaly Detection Engine), which is usually turned off due to its high false alarm rate
- SNORT may be configured in one of the following modes
  - Sniffer mode: reads the packets from the network and displays them in a continuous stream on the console
  - Packet logger mode: logs the packets to disk
  - Intrusion detection mode: analyzes network traffic for matches against a user-defined rule set and performs several actions based upon what it sees

SPADE: SNORT Anomaly Detection
- SPADE is a SNORT preprocessor plugin which sends alerts about anomalous packets through the standard SNORT reporting mechanisms (the fewer times that a particular kind of packet has occurred in the past, the higher its anomaly score will be)
- It is a part of the SPICE (Stealthy Probing and Intrusion Correlation Engine) project at www.silicondefense.com
- SPICE consists of two parts:
  - SPADE, which acts as an anomaly sensor engine and reports anomalous events to the event correlator
  - An event correlator, which groups these events together and sends out reports of unusual activity (e.g., portscans)
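The "rarer means more anomalous" idea can be sketched with a frequency table and a negative log probability. This is an illustration of the principle only and is not SPADE's source code: the class name `FrequencyAnomalyScorer`, the choice of what counts as a "kind" of packet, and the add-one smoothing are assumptions.

```python
import math
from collections import Counter


class FrequencyAnomalyScorer:
    """Score packet kinds by how rarely they have been seen so far."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, kind):
        """Record one packet of the given kind, e.g. (protocol, dst_port)."""
        self.counts[kind] += 1
        self.total += 1

    def score(self, kind):
        """Negative log2 of the smoothed relative frequency of `kind`."""
        # Add-one smoothing keeps the score finite for unseen kinds.
        probability = (self.counts[kind] + 1) / (self.total + 1)
        return -math.log2(probability)


scorer = FrequencyAnomalyScorer()
for _ in range(99):
    scorer.observe(("tcp", 80))
scorer.observe(("tcp", 31337))

common = scorer.score(("tcp", 80))
rare = scorer.score(("tcp", 31337))
unseen = scorer.score(("udp", 9))
```

Here `unseen` is larger than `rare`, which in turn is larger than `common`, so an alert threshold on the score would fire first for traffic that has rarely or never been observed.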

Recently detected real-life attacks
- Source: http://www.cert.org/current/current_activity.html#Microsoft-DS
- Microsoft-DS (445/tcp) Activity (updated August 9, added August 9)
- “We have received reports of widespread scanning and possible denial of service activity targeted at the Microsoft-DS service on port 445/tcp. We are interested in receiving reports of this activity from sites with detailed logs and evidence of an attack. Please send all reports to [email protected]”
