V) Analysis of Simulation Results

I performed one simulation run for each topology shown in Appendix IV, save the torus, which I did not have time to implement. The simulations ran with the previously noted packet injection rates for random traffic until an average of 500 words of traffic per node had been generated for the 3600 node systems and 500 words per node for the 2048 node systems. At the average packet length of 2.2 words per packet, this translates to roughly 930K packets and 819K packets, respectively.
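For reference, the conversion from per-node traffic to total packet count is simply the product of the node count and the per-node word target divided by the average packet length; the notation below is introduced here for clarity and is not the simulator's own:

\[
N_{\text{packets}} \;\approx\; \frac{N_{\text{nodes}} \times W_{\text{node}}}{\bar{L}}, \qquad \bar{L} \approx 2.2 \text{ words per packet},
\]

where \(W_{\text{node}}\) is the target number of words of traffic generated per node.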
The simulation was fairly successful, although the topologies with many tight cycles, such as the SCC and de Bruijn, entered the region of network saturation, where more packets were produced than were consumed. Although this greatly slowed simulation, it allows the data to more strikingly illustrate areas of bottlenecking in a congested system. Each simulation ran for approximately 5 to 20 minutes on a Pentium® system with a 75 MHz memory bus and a 266 MHz internal clock. The simulator seems fairly fast and stable; however, I have experienced a great deal of difficulty with the stand-alone plotter tool, which interfaces with the plplot X11 plotting utility.
Data collected at the packet level by the simulator includes the number of hops the packet has made and the number of cycles the packet has spent waiting for a route. When packets reach their destination they are logged, and these numbers are added to accumulators which can then be divided by the number of packets routed to find the average wait and average hop counts for random packets in the simulated topology at the simulated traffic level. The average latency in a topology can therefore be thought of as some constant cycle length times the sum of the average number of hop cycles per packet and the average number of wait cycles per packet. Appendix V shows a graph comparing the resulting average wait and average hop values in each topology under random traffic with the previously mentioned simulation conditions.

Data is also collected in the simulator at the nodes for each link in the topology. Since each link keeps track of the number of cycles in which it routed (and prohibits routing more words per cycle than the link width), three link state counters are maintained: b_cycles, indicating the number of cycles the link was completely busy; i_cycles, indicating the number of cycles the link was idle; and p_cycles, indicating the number of partial (neither busy nor idle) cycles for the link. Also collected at the per-link level is null_wds, a total of the number of null words (empty slots) that the link has transmitted.
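The following is a minimal sketch of this bookkeeping, assuming a C implementation; the per-link counter names (b_cycles, i_cycles, p_cycles, null_wds) come from the description above, but the structures, field names, and function names are illustrative rather than the simulator's actual code:

    /* Per-packet statistics carried with each packet and per-link counters
     * kept at the nodes.  Names other than b_cycles, i_cycles, p_cycles,
     * and null_wds are assumptions for this sketch. */
    #include <stdio.h>

    typedef struct {
        unsigned hops;          /* hops the packet has made so far        */
        unsigned wait_cycles;   /* cycles spent waiting for a route       */
    } packet_t;

    typedef struct {
        unsigned long b_cycles; /* cycles the link was completely busy    */
        unsigned long i_cycles; /* cycles the link was completely idle    */
        unsigned long p_cycles; /* partial (neither busy nor idle) cycles */
        unsigned long null_wds; /* null words (empty slots) transmitted   */
    } link_stats_t;

    static unsigned long total_hops, total_waits, packets_routed;

    /* Called when a packet reaches its destination and is logged. */
    void log_delivery(const packet_t *p)
    {
        total_hops     += p->hops;
        total_waits    += p->wait_cycles;
        packets_routed += 1;
    }

    /* At the end of a run the accumulators give the per-packet averages;
     * average latency is then roughly cycle_time * (avg_hops + avg_waits). */
    void report(double cycle_time)
    {
        double avg_hops, avg_waits;
        if (packets_routed == 0)
            return;
        avg_hops  = (double)total_hops  / packets_routed;
        avg_waits = (double)total_waits / packets_routed;
        printf("avg hops %.2f, avg waits %.2f, est. latency %.2f\n",
               avg_hops, avg_waits, cycle_time * (avg_hops + avg_waits));
    }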
In a balanced indirect MPP topology under heavy traffic it would be desirable for all links to experience approximately the same traffic [2], so that no money is wasted on a link which could have been built with some lower bandwidth. To test this against the simulation data, the per-link null word counter was read for each link and made into a distribution of how many links in the topology experience a certain level of null words. One of these distributions was produced for each topology simulated, then selected distributions were aligned back-to-back in a 3D mesh plot for analysis. This plot is shown in Appendix VI along with some analysis.
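As an illustration of the binning step, here is a minimal sketch in C, under the same assumptions as the previous sketch; the bin count and bin width are arbitrary choices for illustration, not the values used to produce the plots in Appendix VI:

    #include <string.h>

    #define NBINS 64    /* number of histogram bins (arbitrary for this sketch) */

    /* Build a distribution of how many links fall into each null-word range,
     * given the per-link null_wds totals for one topology. */
    void null_word_histogram(const unsigned long *null_wds, int nlinks,
                             unsigned long bin_width, unsigned long hist[NBINS])
    {
        int i;
        if (bin_width == 0)
            bin_width = 1;
        memset(hist, 0, NBINS * sizeof hist[0]);
        for (i = 0; i < nlinks; i++) {
            unsigned long bin = null_wds[i] / bin_width;
            if (bin >= NBINS)
                bin = NBINS - 1;    /* clamp heavy outliers into the last bin */
            hist[bin]++;
        }
    }

One such distribution per topology then corresponds to one row of the 3D mesh plot.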