II) Selection of Topologies
In simulating IN topologies my goal was to examine cost effective connectivity for MPP
machines of the not so distant future. According to an IN design adage, the cost of an IN per node
is roughly proportional to the node degree, that is, the number of IN links per node. With future MPP
machine sizes approaching thousands of nodes, the hypercube, a long-time favorite topology, becomes
exceedingly expensive, requiring a node degree of eleven for a system of 2048 nodes.
Therefore this simulation examines topologies that connect a similar number of nodes at a
degree cost of eleven or less, making the hypercube the high cost standard against which each
of these cheaper topologies is compared.
It is also desirable to test only balanced, or full, topologies, since then all nodes have a
uniform number of links and the full allotted node address space is used. All possible traffic
patterns and link bottlenecks between any two addressable nodes within the tested
topology can therefore be observed by the simulation. A balanced topological size of 2048 nodes was
chosen for simulating topologies using binary node addressing, so those systems have
eleven node address bits (2^11 = 2048). This system size of 2048, besides being feasible for an
MPP system, is also near in magnitude to the chosen system size of 3600 nodes for systems using
unique symbol node addressing, such as the SCC topology.
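As a sanity check on these two system scales, both can be computed directly from the size formulas quoted elsewhere in this report (a small Python sketch; function names are my own):

```python
from math import factorial

# hypercube(n) connects 2^n nodes; SCC(n) connects (n-1)*n! nodes.
def hypercube_size(n):
    return 2 ** n

def scc_size(n):
    return (n - 1) * factorial(n)

print(hypercube_size(11))  # 2048
print(scc_size(6))         # 3600
```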
III) Overview of Topologies Simulated
The star connected cycles (SCC) topology is a variation of the star network which is
based on the concept of unique symbol addressing. Each node in a star(n) network has an
address composed of n unique symbols. A connection is specified from any node in a star
network to any other node whose n-symbol address can be achieved by swapping the first
symbol of the address with any one of the latter symbols. As an example, in a star(4) topology
node 1234 is connected to node 2134, node 3214, and node 4231. The size of a star(n)
network is n! nodes.
The SCC extends this topology by specifying a ring of size n-1 at each
star(n) node address, so that a sub, or lateral, network is formed to route an additional
symbol of lateral network address. This specifies a new address format of n unique symbols
plus one additional non-unique cycle position symbol ranging from 2 to n. Each node on an SCC
cycle has a link to the address attained by swapping the first address symbol with the symbol at
its cycle position, and two lateral cycle links. For example, in an SCC(4), as shown in Appendix I,
node[1234:2] (where the address is specified in network
address : ring address format) has a connection to node[2134:2], as well as node[1234:4] and
node[1234:3]. Node[1234:1] does not exist because the last symbol, the ring address, dictates which
symbol is to be swapped with the first, and thus the first symbol may not be swapped with itself.
The symbol swapping principle is symmetrical; therefore node[2134:2] is also connected to
node[1234:2]. This design has a node degree of only three outgoing links and is maximally fault
tolerant, in that any two links can be removed from a node and it will still be connected to the
system. The size of an SCC(n) network is found by multiplying the size of the star(n)
network by the size of the lateral cycles and is therefore (n-1)n! nodes.
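The SCC link rule can be sketched as a small neighbor function (a hypothetical helper, assuming addresses are permutation strings and cycle positions run from 2 to n as described):

```python
def scc_neighbors(addr, pos, n):
    """Neighbors of SCC(n) node [addr:pos]: addr is a string of n
    unique symbols, pos is the cycle position symbol in 2..n."""
    i = pos - 1  # 0-based index of the symbol named by the cycle position
    # Star link: swap the first symbol with the symbol at position pos.
    swapped = addr[i] + addr[1:i] + addr[0] + addr[i + 1:]
    up = pos + 1 if pos < n else 2    # lateral cycle links wrap
    down = pos - 1 if pos > 2 else n  # around positions 2..n
    return [(swapped, pos), (addr, up), (addr, down)]

print(scc_neighbors("1234", 2, 4))
# [('2134', 2), ('1234', 3), ('1234', 4)], matching the SCC(4) example
```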
Many other topologies using unique symbol node addressing exist, including cube connected
cycles (CCC), pancake connected cycles (PCC), and the bubble sort graph. Each of these topologies
specifies a slightly different swapping rule used to specify links and route packets. Out of
all the popular CC topologies I have chosen to simulate only the SCC,
which is reported to be among the most balanced and simplistic of the unique symbol exchange
topologies.
The star connected interchange (SCI) topology is a minor variation of the SCC topology which I
have created for experimentation in this project. Rather than having each star(n) node
replaced with an (n-1)-cycle, it is replaced by a set of n-1 nodes which are fully
interconnected; thus each node has a degree of n-1: one network link and n-2 lateral
links. Although this node degree is slightly expensive, an SCI(6) network will connect more nodes
at cheaper cost than a hypercube(11). This implementation serves to simplify the routing logic and
decreases both the average and maximum graph diameter, but it is questionable whether the addition
of these extra lateral links will be justified, as it will certainly make the network links
bottlenecks compared to the lateral links. Most likely, better SCI performance per cost could be
achieved by customizing network and lateral link widths to two separate values. Due to time
constraints, simulation of such a multi-link-width topology was not attempted.
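Under the same position-swap rule as the SCC, the SCI neighbor set simply replaces the two cycle links with lateral links to every other node in the fully interconnected set (a sketch with hypothetical helper names):

```python
def sci_neighbors(addr, pos, n):
    """Neighbors of SCI(n) node [addr:pos]: one network (swap) link
    plus lateral links to all other nodes in the set of n-1."""
    i = pos - 1
    swapped = addr[i] + addr[1:i] + addr[0] + addr[i + 1:]
    lateral = [(addr, p) for p in range(2, n + 1) if p != pos]
    return [(swapped, pos)] + lateral

print(len(sci_neighbors("123456", 2, 6)))  # degree n-1 = 5
```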
A de Bruijn graph or shuffle-exchange topology is based on a concept of bit shifting. Starting
from the source node, messages are routed by shifting a bit from the destination address into the
current node address and routing to the resultant address. This is repeated for each bit in the
destination address so that routing occurs like the following node pattern:
source = (S1,S2,S3) -> (S2,S3,D1) -> (S3, D1, D2) -> (D1, D2, D3) = destination
The de Bruijn graph is of particular interest because it remains one of the least costly methods
of connecting 2^n nodes and yet it still maintains a maximum diameter of n. This
approach has very limited routing options and no fault tolerance, but is a simple routing scheme
requiring few connections. A de Bruijn graph topology is important to MPP architectures because its
principles can be used to create shuffle-exchange sub-cycles in a topology to implement an
effective second level routing scheme that would reduce the overall connectivity requirement.
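The shifting pattern above can be expressed directly (a sketch assuming addresses are fixed-length bit strings; note this naive scheme always takes one hop per address bit, even when a shorter path exists):

```python
def de_bruijn_route(src, dst):
    """Route by shifting destination bits into the current address,
    one per hop: (S1,S2,S3) -> (S2,S3,D1) -> ... -> (D1,D2,D3)."""
    path = [src]
    cur = src
    for bit in dst:
        cur = cur[1:] + bit  # shift left, append next destination bit
        path.append(cur)
    return path

print(de_bruijn_route("110", "011"))  # ['110', '100', '001', '011']
```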
The hypercube is a topology where each node is connected to all other nodes that differ in
node address by only one bit. This requires a node degree of n outgoing links, where n is
the number of node address bits. Because the hypercube is so strongly connected it is highly fault
resistant, offering n disjoint parallel paths between any two nodes in the system. Thus up to n-1
nodes may fail in a hypercube before it becomes disconnected.
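The single-bit-difference rule gives each node exactly n neighbors, which can be enumerated by flipping each address bit in turn (addresses as integers; a minimal sketch):

```python
def hypercube_neighbors(addr, n):
    """All nodes differing from addr in exactly one of n address bits."""
    return [addr ^ (1 << b) for b in range(n)]

print(hypercube_neighbors(0b000, 3))  # [1, 2, 4], i.e. 001, 010, 100
```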
The multidimensional shuffle-exchange network (MDSXN) is a hybrid of hypercube and de
Bruijn topologies. The node address bits are divided into groups of sub-dimensions, and packets
are routed in separate phases along each sub-dimension. An illustration of an MDSXN[1,2] is
included as Appendix II of this report. Sub-dimensions consisting of two
or more address bits are connected in a de Bruijn cycle, and sub-dimensions of only one address
bit are one-dimensional hypercube-like links. Thus an MDSXN[1,1,1] is a hypercube(3) and an
MDSXN[3] is a de Bruijn(3). The node degree of each MDSXN node is equal to twice the number
of sub-dimensions with more than one address bit, plus the number of
sub-dimensions with a single address bit. With each additional sub-dimension the MDSXN
topology gains a degree of fault tolerance and the minimal traffic routing patterns become more
distributed, or balanced, at the cost of a greater number of links. Simulations show that MDSXNs
perform optimally when the address bits are divided into equal sized sub-dimensions.
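The degree rule above can be stated compactly (a sketch; sub-dimensions are given as a list of address-bit counts, and degree counts outgoing links as elsewhere in this report):

```python
def mdsxn_degree(dims):
    """Two outgoing links per de Bruijn sub-dimension (>1 bit),
    one per single-bit hypercube-like sub-dimension."""
    return sum(2 if bits > 1 else 1 for bits in dims)

print(mdsxn_degree([1, 1, 1]))  # 3, matching a hypercube(3)
print(mdsxn_degree([3]))        # 2, matching a de Bruijn(3)
```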
Tori, or their cousins lacking the ring-closing links, meshes, offer a topology requiring only very
simple routing and are the easiest of all indirect topologies to scale. They have a node degree
equal to twice the number of dimensions they have and a diameter equal to the sum of half the number
of nodes in each dimension. It is important to realize that tori and meshes of greater dimension
than the familiar two or three dimensional geometric shapes exist and hold similar properties. They
are fault tolerant up to a number of failures one less than the size of the smallest dimension.
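The torus degree and diameter rules can be written down directly (a sketch; dimensions are given as the node count along each axis):

```python
def torus_degree(dims):
    """Two links per dimension: one neighbor in each direction."""
    return 2 * len(dims)

def torus_diameter(dims):
    """Sum of half the node count in each dimension, since the
    wrap-around link halves the worst-case distance per ring."""
    return sum(d // 2 for d in dims)

print(torus_degree([8, 8]))    # 4
print(torus_diameter([8, 8]))  # 8
```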
Unfortunately I lacked the time to implement a torus topology class for simulation, as I had