[To appear in the Network Analysis Times, July 2000.]
One of WCI's recent efforts has been the development of throughput tests and network diagnostics using the AMP constellation. Every DREN AMP box (referred to as damp-xxx) runs netperf and testrig servers and can thus be used as a target for user tests. Testrig is configured to offer 750 KB windows, so it should be suitable for high-speed TCP flows over reasonable delays. Periodic treno and mping tests are performed, usually early on Saturday or Sunday mornings. Summaries are built showing the n x n matrix of results between seven DREN sites. Mping in particular has proven useful in revealing problems over wide-area paths.
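As a rough illustration of why a 750 KB window matters, the sketch below computes the usual window-limited bound on TCP throughput (at most one window of data per round trip). The function name, the reading of "750 KB" as 750 * 1024 bytes, and the sample round-trip times are assumptions made for the example, not measurements from DREN.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


# Assumption: "750 KB" is taken to mean 750 * 1024 bytes.
WINDOW_BYTES = 750 * 1024

# Illustrative round-trip times only; not measured values.
for rtt_ms in (10, 50, 100):
    mbps = window_limited_throughput_bps(WINDOW_BYTES, rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:3d} ms: at most {mbps:.1f} Mbit/s")
```

Under these assumptions the window stops being the limiting factor for most wide-area paths, which is the sense in which it is "suitable" for high-speed flows.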
| Figure 1 | Figure 2 |
The above pictures show selected mping "thumbnails" from the weekly diagnostic tests. In all cases 1000 byte packets were used, and the number of packets in flight was varied from 1 to 500. Packets per second (pps) vs. the number in flight (window size) is shown, with green being the pps transmitted and red the pps received. The graphs are named src-dst, i.e. source and destination node names, and are normally displayed in a full n x n matrix so that anomalies are quickly visible. [More recently, we have switched to using 1250 byte packets, because at 10000 bits per packet it is easy to read off both packets per second and bits per second from the Y axis.]
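The convenience of 1250 byte packets is simple arithmetic, sketched here. The function name and the 5000 pps reading are illustrative assumptions, not values taken from the graphs.

```python
def bits_per_second(pps: float, packet_bytes: int) -> float:
    """Convert a packets-per-second reading into bits per second."""
    return pps * packet_bytes * 8


# Hypothetical Y-axis reading of 5000 pps, for both packet sizes in the text.
for size in (1000, 1250):
    mbps = bits_per_second(5000, size) / 1e6
    print(f"{size} byte packets at 5000 pps = {mbps:.0f} Mbit/s")
```

With 1250 byte packets each packet is exactly 10000 bits, so a pps value converts to bits per second by appending four zeros.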
Figure 1 shows fairly normal (or desired) behavior with a well-defined knee and a long stable queueing region, while Figure 2, the reverse path, shows increasing round trip time (1/slope) with load. Together these show that packet forwarding behavior in opposite directions of the same path may be quite different. As load increases to the right (more packets in flight), eventually some element in the network becomes overloaded and packets are discarded (visible as separation between the green and red lines). TCP throughput tests such as treno or testrig usually report data rates near or slightly below the point where significant loss begins.
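The relationship between slope and round trip time can be sketched as follows. Below the knee, the number of packets in flight equals the packet rate times the round trip time, so the slope of pps against window is 1/RTT. The sample data, function names, and the 1% loss threshold are all hypothetical; this is not mping's own output format or analysis.

```python
from typing import List, Optional, Tuple

# (packets in flight, pps transmitted, pps received); made-up sample data.
Sample = Tuple[int, float, float]


def rtt_from_slope(samples: List[Sample], max_window: int) -> float:
    """Estimate RTT as 1/slope, fitting received pps vs. window through the origin."""
    pts = [(w, rx) for w, _, rx in samples if w <= max_window]
    slope = sum(w * rx for w, rx in pts) / sum(w * w for w, _ in pts)
    return 1.0 / slope


def loss_onset(samples: List[Sample], threshold: float = 0.01) -> Optional[int]:
    """Return the first window where transmitted and received rates separate."""
    for w, tx, rx in samples:
        if tx > 0 and (tx - rx) / tx > threshold:
            return w
    return None


samples: List[Sample] = [
    (1, 50.0, 50.0), (10, 500.0, 500.0), (50, 2500.0, 2500.0),
    (100, 5000.0, 5000.0),   # knee: the bottleneck rate is reached
    (200, 5000.0, 5000.0),   # stable queueing region: flat, no loss
    (400, 6000.0, 5000.0),   # discard: green (tx) and red (rx) separate
]

print(f"estimated RTT: {rtt_from_slope(samples, max_window=100) * 1000:.1f} ms")
print(f"loss begins near window: {loss_onset(samples)}")
```

A path like the one in Figure 2 would show the fitted slope shrinking as `max_window` grows, which is the increasing round trip time described above.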
| Figure 3 | Figure 4 |
Figure 3 shows a relatively common situation. Here there is little stable queueing (something has insufficient buffering) and some drop-off in performance during discard (the red line does not remain flat). Some device is taking time to throw packets away (perhaps logging a message). The periodic spikes indicate some regular event such as a Cisco route-cache cleaner, which might be avoided via CEF, etc.
The behavior in Figure 4 occurs whenever we cross between the continental US and Hawaii. On that path, OC3 ATM hits a DS3 ATM circuit, and thus has to be throttled down. The throughput oscillates in a repeatable way with very little packet loss. We believe this is the effect of an ATM rate shaper beating against the changing offered load, but we have yet to pinpoint the proper course of action to improve it.
| Figure 5 | Figure 6 |
Figures 5 and 6 show high-loss situations. The first, where loss is independent of offered load, could indicate low-level bit errors on a link. The second resulted from an Ethernet duplex problem (auto-negotiation wasn't working properly). Duplex problems often show a characteristic hump rather than a knee. Our experience suggests that duplex problems may be far more common than people realize. Since the network still "works", and low-rate pings report no loss, they often go unnoticed. It is only under load that the problem becomes clear.
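One way to tell the two loss patterns apart numerically is to compare the loss fraction across window sizes. The heuristic below is only a sketch: the function names, the 2% tolerance, and the sample data are assumptions for illustration, and it is not part of mping.

```python
from typing import List, Tuple

# (packets in flight, pps transmitted, pps received); made-up sample data.
Sample = Tuple[int, float, float]


def loss_fractions(samples: List[Sample]) -> List[float]:
    """Fraction of transmitted packets that were not received, per window size."""
    return [(tx - rx) / tx for _, tx, rx in samples if tx > 0]


def loss_is_load_independent(samples: List[Sample], tolerance: float = 0.02) -> bool:
    """True if loss is present at every load and varies by no more than `tolerance`."""
    fractions = loss_fractions(samples)
    return bool(fractions) and min(fractions) > 0 and max(fractions) - min(fractions) <= tolerance


# Roughly constant 5% loss at every load, as might follow from link bit errors.
constant_loss = [(1, 50.0, 47.5), (10, 500.0, 475.0), (100, 5000.0, 4750.0)]
# No loss at low load, heavy loss at high load, as with a duplex mismatch.
load_dependent = [(1, 50.0, 50.0), (10, 500.0, 500.0), (100, 5000.0, 3000.0)]

print(loss_is_load_independent(constant_loss))
print(loss_is_load_independent(load_dependent))
```

The second data set also shows why low-rate pings miss duplex problems: at a window of one packet the loss fraction is zero.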
More examples of mping and what it can show you can be found in these TCP/IP and Network Performance Tuning slides. Links to treno and testrig can be found on our Tools page.
Phil Dykstra phil@wareonearth.com http://sd.wareonearth.com/