This is based directly from Brice Augustin, Timur Friedman, and Renata Teixeira research (2007)
Internet paths are becoming more and more complex due to the increase of load balancers; often there are multiple Internet paths between a source and destination. This paper devises a methodology to discover the possible paths between a source and destination and to classify the type of load-balanced paths that are detected.
A tool was developed to expose load-balanced paths using a probabilistic algorithm called the Multipath Detection Algorithm (MDA). They extended Paris traceroute to not only just identify a single path (through per-flow load balancers), but to identify as many paths as possible. To achieve this Paris traceroute was extended to use MDA. MDA works by disproving a hypothesis at each hop that their are two successor interfaces. It does this by sending six probes with a variation of destination addresses (using the same prefix so that the probes are most likely forwarded to the same destination) and a TTL that exceeds at the next hop so that if the target-hop happened to be a per-packet, per-flow or per-destination load-balancer, the probes would most likely report the existence of at least two different succeeding interfaces. If the hypotheses is confirmed then the algorithm stops, otherwise the hypothesis then is modified to assume that the target hop now load-balances to three paths instead of two. This keeps on going until all succeeding interfaces are found. six probes for disproving the existence of two succeeding interfaces yields a 95% level of confidence, the amount of probes increase as the algorithm incrementally increases the amount of interfaces it aims to disprove to maintain this degree of confidence (e.g. some load balancers may have as many as 16 load-balanced interfaces which would require 96 probes to ensure a 95% level of confidence). The algorithm also classifies the type of load balancer whenever one is detected. When it discovers that there are at least two succeeding interfaces at a specific hop, another 6 probes are sent just as before except that none of the flow identifier fields are changed – if multiple succeeding interfaces are reported then it is classified as a per-packet load balancer. Otherwise it sends yet another 6 probes except this time it changes flow-identifier fields but keeps the destination fields constant – if multiple succeeding interfaces are reported then it is classified as a per-flow load balancer. Otherwise it classified a per-destination load balancer.
Measurements were taken from 15 sources. 13 of which were RON nodes, and 2 of which were host machines; one in Paris and another in Romania. The sources were geographically spread out reasonably evenly across the US and Europe. Two destination lists were used; one called the MIT list which consisted of 68,629 addresses including the source addresses themselves; these destinations consisted of a mix of end-hosts and routers/middleboxes. The other destination list, denoted the WEB list, consisted of 500 of the most popular websites (at the time). They used UDP probes for all their measurements instead of ICMP or TCP probes because load balancers do not always load-balance ICMP packets and TCP probes might have been blocked by IDS’s during the measurements.
Per-destination load balancers were found in 70% of the source-destination pairs. Per-flow load balancers were found in 39% of the source-destination pairs. The fraction of per-destination and per-flow load balanced paths varied a lot over the different sources; indicating that these load-balancers were close to the sources. Only 1.9% of the source-destination pairs traversed through a per-packet load balancer. The small fraction of per-packet load balanced paths remained consistent for each source; indicating that the load-balancers responsible were generally close to the destinations.
Load-balanced paths can be contained within an AS (intra-load balancing) or between ASes (inter-load balancing). Although BGP (The Internet’s inter-domain routing protocol) does not allow multiple next hops per prefix, some vendors are now adding support for such functionality. The load-balanced detected in their measurements were classified by using a public IP-to-AS mapping service. They excluded the divergence and convergence hops (for each diamond) in the traces since it is very common for interfaces at the boundary of ASes to use an IP Address that belongs to another ASes address space. The divergence point is excluded because it might be at the entry point of an AS where the load-balancers interfaces address is from another AS; hence it would incorrectly indicate inter-AS load balancing. The convergence point is excluded because Paris traceroute outputs an IP-level map of the network so it cannot know whether or not the router-level convergence point actually occurs before the IP convergence point (e.g. the hop before the IP convergence point for each path might be different interfaces that belongs to the same router). They only classified the traces that were measured from the Paris host to the MIT-list destinations (because is a long and tedious task since it involves manual verification). They found that most diamonds are created by intra-domain load balancing. 86% of the per-flow diamonds were contained within a single AS.
Diamond widths generally were found to be narrow. 55% of the paths that contained diamonds created be per-flow load balancers only had two link-disjoint paths (for the minimum width), for per-destination this figure was 67% and for per-packet this was 60%. 99% of the paths that contained diamonds created be per-flow load balancers only had at most five link-disjoint paths (for the minimum width), for per-destination this figure was 98% and for per-packet this was 90%. They found that wide diamonds containing from 10 up to16 ink-disjoint paths were mostly two hops long. One explanation for this is that the diamonds detected could be the result of load-balancing between two routers over parallel links. Such wide diamonds however were found to be rare.
Generally diamonds were found to be of a short length (the largest hop-count between the divergence point and convergence point of a diamond). 35% of the paths that contained diamonds created by per-flow load balancers had a length of two. For per-destination this was 26% and for per-packet this was 90%. For longer diamonds they observed lengths of up to 15 for per-flow diamonds, 17 for per-destination diamonds and only 6 for per-packet diamonds. It was found that there was a relationship between diamond lengths and widths such that diamonds tend to be short and narrow, but never really long and wide.
They found that diamonds tend to be symmetric. It was very rare for diamonds to have an asymmetric above three. For per-flow diamonds with a high-degree of asymmetry the diamonds at the router-level may not have been asymmetric but instead could have been due to one path being an MPLS tunnel and another not being an MPLS tunnel (this was confirmed by routers within the diamond including MPLS extensions in the ICMP responses to the Paris traceroute probes).
They investigated the differences of delays between the paths within the diamonds. They were only able to analyse diamonds created by per-flow load balancers as the delay differences were calculated from RTTs therefore the probes must consistently take the same forward and return path to present useful data. Paris traceroutes original implementation only maintained the forward paths flow-identifiers but did not maintain the return-path flow identifiers. They therefore extended Paris traceroute to maintain the return-path identifiers. This was done by selecting flow-identifiers that not only maintain the forward path but also happened to have their ICMP responses compute a checksum that the per-flow load balancers would use to forward it on the same router-path.
In order to measure the RTTs they had to perform another measurement separate to the previous measurements using the extended traceroute. The destinations were the first 5000 reachable destinations that traversed through a per-flow load balancer (known from first measurements taken). For each destination they strategically selected at most six flow identifies that would take six different paths in the target per-flow diamonds. The RTTs were calculated by subtracting the time when the ICMP response was received from the time in which the UDP probe (that invoked the ICMP response) was sent. They computed the RTT differences between the shortest and longest RTTs of the paths found (per diamond) and analysed these results.
They found that only 12% of the observed source-destination pairs (for the delay measurements) had a delay difference of more than 2ms. 59% of the pairs could have saved at least 2% on delay time (this improvement is with respect to the longest delay time). Only 2% could save at least 10% on the delay time (once again, this improvement is with respect to the longest delay time). It was found that there was a correlation between delay differences and lengths of diamonds; 94% of the paths with short diamonds had paths with no RTT differences, as the diamond length increases more delay differences arise. It was also found that there was a correlation between delay differences and the asymmetric property of diamonds; 90% of the symmetric diamonds had no delay differences, this portion generally dropped off as the diamonds asymmetry increased.