This summary is based directly on research by David Moore and colleagues at the University of California, San Diego.
Internet researchers at the University of California, San Diego, investigated current DoS activity on the Internet to determine the magnitude of damage it was doing to Internet-connected hosts and to discover any trends or recurring patterns of attack. DoS attacks aim to make computer resources or services unavailable to their intended users. They come in two classes. One class is logical attacks, in which software flaws in remote servers are exploited to make them crash or degrade in performance. The other class is flooding attacks, in which the victim's CPU, memory, and/or network resources are overwhelmed by large volumes of spurious requests. The study focuses on the latter class.
Flooding attacks overwhelm the victim's network by sending many small packets as fast as possible, rather than fewer large packets, because network devices are limited not by bandwidth but by the rate at which packets can be processed. They also overwhelm the victim's CPU and memory by sending packet types that require processing above the network layer (and hence require the operating system to process each packet as it is passed up the protocol stack). The best-known DoS flooding attack is the "SYN flood". DoS attacks can be distributed (DDoS) to make them more powerful: innocent hosts are infected with a small attack daemon. These daemons carry out the flood attacks and can be controlled so that thousands of infected hosts (unwittingly) focus their attacks on a single victim. "IP spoofing" is used to conceal the source of the flood, with the source IP either set randomly per packet or set to a single victim's address for reflector attacks.
The researchers had to develop their own way of obtaining a representative sample of DoS attacks on the Internet: firstly because service and content providers consider such monitoring data sensitive and private, and secondly because even with providers' data it would be extremely difficult to obtain a representative sample of the complete Internet address space. They obtained their sample by monitoring what they call "backscatter". Backscatter consists of reply packets that were never requested; it separates genuine request traffic from spoofed traffic, since replies to genuine requests go to destinations that actually made the request. Their main challenge was to develop a method of gathering backscatter data from a representative sample of the Internet.
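The backscatter idea can be illustrated with a minimal sketch. Here a monitor passively records packets arriving at an otherwise-unused address block; reply-type packets (SYN-ACKs, RSTs, ICMP errors) arriving at addresses that never sent a request are classified as backscatter. All names (`Packet`, `MONITORED_PREFIX`, the signature set) are illustrative assumptions, not details from the study.

```python
# Hypothetical sketch of backscatter identification, assuming a monitor
# that passively records packets arriving at an unused address block.
import ipaddress
from dataclasses import dataclass

# Stand-in for the monitored block (the study used a portion of the
# IPv4 address space; this prefix is purely illustrative).
MONITORED_PREFIX = ipaddress.ip_network("10.0.0.0/8")

# Packet types a victim typically sends only in *reply*; unsolicited
# arrivals of these at unused addresses are likely backscatter.
REPLY_SIGNATURES = {"SYN-ACK", "RST", "ICMP-unreachable", "ICMP-ttl-exceeded"}

@dataclass
class Packet:
    src: str   # claimed source (for backscatter, the victim's address)
    dst: str   # spoofed address that happens to fall in the monitor
    kind: str  # coarse packet classification

def is_backscatter(pkt: Packet) -> bool:
    """A reply-type packet sent to an address that never requested anything."""
    return (ipaddress.ip_address(pkt.dst) in MONITORED_PREFIX
            and pkt.kind in REPLY_SIGNATURES)
```

An unsolicited SYN-ACK landing inside the monitored block would be flagged, while an ordinary SYN (a request, not a reply) would not.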
Backscatter packets reach the monitor when the attacker's spoofed source addresses happen to fall within the monitored address range. The chosen method introduced biases into the collected sample because of three assumptions. The first and most significant assumption was that all spoofed source addresses in flood packets were assigned randomly, with a uniform distribution. Some ISPs block packets whose source address lies outside the address space of their customers' networks; such packets are therefore excluded from the sample. Reflector attacks use the same source IP for all spoofed packets; if that address does not fall within the monitored address space, every packet involved in the reflector attack is excluded from the sample. The second assumption was that all malicious packets are delivered reliably to the victim and all backscatter reliably to the monitor; this is incorrect, since IP is a best-effort service, so corrupted, dropped, or lost packets are excluded from the sample. The third assumption was that all observed backscatter genuinely resulted from DoS attacks; in reality the data could include packets mis-interpreted as attack backscatter (for example, random port scans). The effect of this third assumption on the results, however, was considered negligible.
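The uniform-spoofing assumption is what makes extrapolation possible: a monitor covering n of the 2^32 IPv4 addresses sees, in expectation, the fraction n / 2^32 of a victim's backscatter, so the observed rate can be scaled back up. The sketch below shows that arithmetic; the function and variable names are mine, not the authors'.

```python
# Sketch of the extrapolation implied by the uniform-spoofing assumption:
# if spoofed sources are uniform over the IPv4 space, a monitor covering
# n addresses observes roughly the fraction n / 2**32 of the backscatter.

ADDRESS_SPACE = 2 ** 32  # size of the IPv4 address space

def estimated_attack_rate(observed_pps: float, monitored_addresses: int) -> float:
    """Scale an observed backscatter rate up to an estimated victim-side rate."""
    return observed_pps * ADDRESS_SPACE / monitored_addresses

# With a 1/256 monitor (2**24 addresses), each observed packet per second
# corresponds to roughly 256 packets per second aimed at the victim.
rate = estimated_attack_rate(10, 2 ** 24)  # → 2560.0
```

This scaling is only valid under the first assumption; ingress filtering and reflector attacks (which violate uniformity) bias the estimate downward or exclude attacks entirely, as described above.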
Three traces of backscatter were recorded, each covering approximately 7 days, to form the representative sample of DoS attacks on the Internet. A portion of 1/256 of the total Internet address space was monitored. The backscatter data was post-processed into two forms of sample data, used to analyze the attacks in two different ways. One form supported "flow-based" analysis: a flow is defined as a series of consecutive packets sharing the same target IP address and IP protocol, so conclusions and statistical summaries can draw on information such as attack duration, attack targeting, and the protocols (and their variations) used in attacks. The other form supported "event-based" analysis: an event is defined as a small time frame (of 1 minute) in which at least 10 backscatter packets were recorded, so conclusions and statistical summaries can draw on the intensity of attacks.
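The two aggregations above can be sketched as follows. This is an illustrative reconstruction from the definitions given, not the authors' code: a "flow" groups packets by (victim IP, protocol), and an "event" is any one-minute bucket containing at least 10 backscatter packets.

```python
# Illustrative sketch of the two post-processing aggregations,
# assuming packets arrive as (timestamp, victim_ip, protocol) tuples.
from collections import defaultdict

def group_flows(packets):
    """Group backscatter packets into flows keyed by (victim IP, protocol)."""
    flows = defaultdict(list)
    for ts, victim, proto in packets:
        flows[(victim, proto)].append(ts)
    return flows

def find_events(packets, threshold=10, window=60):
    """Return one-minute buckets (per victim) holding >= `threshold` packets."""
    buckets = defaultdict(int)
    for ts, victim, _proto in packets:
        buckets[(victim, int(ts // window))] += 1
    return {key: count for key, count in buckets.items() if count >= threshold}
```

For example, twelve packets aimed at one victim within a single minute form one flow and also qualify as one event, since the count exceeds the 10-packet threshold.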
Summary of Results from Event-Based Attack Data
In 0.3% of attacks the random source IP addresses were evenly distributed, and 2.4% of all attacks could overpower even firewalls well equipped to resist DoS attacks. The majority of the recorded attacks were intense enough (i.e., had a high enough packet rate) to overwhelm commodity anti-DoS solutions, and a small fraction were intense enough to overwhelm high-end solutions.
Summary of Results from Flow-Based Attack Data
50% of the attacks and 20% of the backscatter packets involved TCP packets with the RST flag set. Most of these attacks were focused on multiple ports rather than a single specific port.
50% of attacks last less than 10 minutes, 80% less than 30 minutes, and 90% less than an hour.
A significant fraction of attacks are directed against home users. 2-3% of attacks target name servers. 1-3% of attacks target routers.
Attacks on commercial sites are aimed not just at large sites (like hotmail.com) but also at a wide range of small- to medium-sized businesses.
Approximately 15% of attacks target the .net TLD, and another approximately 15% target the .com TLD.
65% of victims were only attacked once.