RUNET IMPLEMENTATION MODEL

Document Managed by Network Architecture

Introduction

The role of networking and computing at Rutgers University has become increasingly vital over time. The ubiquitous application of network resources for local as well as distant connectivity has become critical to the academic mission of the University. The transition of the Rutgers University Network (RUNet) from a select research asset to a strategic University resource is well under way. The goal of the RUNet 2000 project will be to facilitate this transition by constructing a flexible, fault tolerant network to connect all of Rutgers University.

Nomenclature

The consistent use of vocabulary provides for a predictable mode of communication and allows for effective discussion of the network design. There are a number of terms that have widespread use within the telecommunications industry and at the University. Where standard usage differs, the preferred interpretation will be provided or the potentially ambiguous expression will not be utilized.

The term Local Area Network (LAN) as used broadly in telecommunications refers to a relatively large network within a 1K to 2K limit. Unfortunately, it also has broad, but different usage at Rutgers University as a term that describes a departmental subnet. It would not be incorrect to utilize this term either way and allow context to make it clear which definition applies. Unfortunately, this has the potential to obfuscate the message at a time when clarity is critical. It is not a fundamental goal of this document to modify the working vernacular of the University community, but rather to select language and terminology to more effectively share this information with the rest of Rutgers. Thus, the term LAN will not be utilized in this document.

The use of a geographic or proper name signifier as a prefix or postfix to an otherwise generic expression indicates a reference either to that section of a larger element that is wholly contained within the named location, or to the named element together with its wholly contained sub elements.

Hill Center network - That section of the Rutgers network that is wholly contained within Hill Center. This would include the Hill Center core, the Hill Center distribution, and Hill Center access.

Busch network - That section of the Rutgers network that is wholly contained on Busch campus. This would include the Busch core, all contained distribution networks, and all contained access networks.

New Brunswick network - That section of the Rutgers network that is wholly contained on the Busch, Livingston, College Avenue, Cook, and Douglass campuses. This would include the respective core, distribution, and access networks.

Camden network - That section of the Rutgers network that is wholly contained on the Camden campus.

etc.

It can be seen that the recommended vocabulary for RUNet is a consistent set of terminology that permits clarity in communication. The scope of an expression, either geographic or topological, should be apparent at all times.

Network layers

The physical layer (L1) consists of the actual cable utilized to transport data. RUNet will be constructed using multi mode fiber optic cable, single mode fiber optic cable, and twisted pair copper cable. Fiber optic cable and twisted pair cable are frequently referred to simply as glass and copper respectively. An L1 connection between two nodes requires direct physical cabling.

The fiber plant for RUNet is a three tier hierarchy that was partially designed to minimize trenching. There are three physical layer building designations in the RUNet cabling plan:

  • A - contain intercampus fiber
  • B - contain intracampus distribution fiber
  • C - terminal buildings

An A building is the root of a fiber tree that extends beneath it. The A buildings are directly connected to multiple child B buildings, but also have direct fiber paths between themselves. They act as the terminal locations for intercampus fiber. B buildings are utilized as fiber distribution sites by being directly connected to one parent A building and multiple child C buildings. The fiber plant does not extend past the C buildings. From this perspective, they are physical leaf nodes.
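The hierarchy described above can be sketched as a small tree data structure. This is a minimal illustration, assuming the strictly designed plant (A parents B, B parents C); the class name, field names, and building labels are hypothetical and are not part of the RUNet documentation. Direct fiber paths between A buildings are not modeled.

```python
from dataclasses import dataclass, field

# Allowed parent -> child designations in the designed three tier plant.
ALLOWED_CHILD = {"A": "B", "B": "C"}


@dataclass
class Building:
    name: str          # hypothetical label, for illustration only
    designation: str   # "A", "B", or "C"
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Attach a child building, enforcing the A -> B -> C hierarchy."""
        if ALLOWED_CHILD.get(self.designation) != child.designation:
            raise ValueError(
                f"{self.designation} building cannot parent a "
                f"{child.designation} building in the designed plant"
            )
        self.children.append(child)
        return child

    def is_leaf(self):
        """C buildings are physical leaf nodes: the plant stops there."""
        return self.designation == "C"


def count_by_designation(root):
    """Count the buildings of each designation in the tree under `root`."""
    counts = {"A": 0, "B": 0, "C": 0}
    stack = [root]
    while stack:
        building = stack.pop()
        counts[building.designation] += 1
        stack.extend(building.children)
    return counts
```

A real plant may contain exception cases that this strict check would reject, so the validation rule should be read as a description of the design intent rather than a hard constraint.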

L1_tree.gif

Figure 1: Designed fiber plant.

The data link layer (L2) will be primarily constructed utilizing a variety of technologies (e.g., Ethernet, Fast Ethernet, and Gigabit Ethernet). Because RUNet will not utilize hubs as part of the infrastructure, L2 devices can be assumed to be switches. By definition, a direct L2 connection requires point to point transmission of data without hardware address rewrite or decremented TTL. Thus, direct point to point links as well as switch fabrics qualify as direct L2 connections. The extensive use of switching technologies will reduce unicast collision domains while permitting the use of appropriately sized broadcast domains.

The network layer (L3) will be constructed utilizing routers. Despite contemporary industry hype, this document assumes that all L3 devices are routers performing hardware address rewrite and decrementing TTL. By definition, an L3 connection or L3 neighbor relationship between nodes requires a complete L2 path without any intervening L3 devices. The network design will, in conjunction with the remainder of the telecommunications industry, converge on IP only as the routed protocol of choice.
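The L2 and L3 definitions above amount to a simple rule over the devices that sit between two nodes. The following is a minimal sketch of that rule, assuming a path is given as an ordered list of device kinds; the enum and function names are hypothetical.

```python
from enum import Enum


class Kind(Enum):
    HOST = "host"      # leaf node
    SWITCH = "switch"  # L2 device: forwards frames unchanged
    ROUTER = "router"  # L3 device: rewrites hardware address, decrements TTL


def is_l3_neighbor_path(intermediate):
    """Return True if every device strictly between two nodes is a switch.

    `intermediate` excludes the two endpoints. An empty list models a
    direct point to point link, which also qualifies.
    """
    return all(kind is Kind.SWITCH for kind in intermediate)


def l3_hops(path):
    """Count the routers on a path, i.e. the number of TTL decrements."""
    return sum(1 for kind in path if kind is Kind.ROUTER)
```

For instance, two routers joined through a switch fabric remain L3 neighbors, while a router anywhere in between breaks the neighbor relationship.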

Design model

The basic design consists of isolated trees connected at their root nodes and utilizing four elements: core, distribution, access and leaf nodes (see figure 2). The network design model is independent of the network fiber plant. However, the connection of network elements using a tree topology is consistent with both the underlying fiber plant and the proposed routing protocol. Rutgers University will rely on the Open Shortest Path First (OSPF) routing protocol, which functions significantly better in tree-like topologies.

Design.gif

Figure 2: Topological map of the design network representing core (C), distribution (D), access (A), and leaf (L) nodes.

A leaf node (L) is a terminal network entity and represents a host client. The legacy network has approximately 20,000 host clients and this figure is expected to double over the next four years. Thus, the overall design of RUNet is currently targeted for 50,000 to 60,000 leaf nodes.

The access layer will be composed of L2 devices (A) contained in regulation wiring closets. Wall plate outlets in buildings will be wired directly back to an access switch to provide switched networking to all wired locations. An access network (AKA access tree) is the section of the access layer that is entirely contained beneath a single distribution router. Access networks are not multiply connected to separate distribution routers.

The distribution layer will consist of both L2 and L3 devices. The L3 devices (D) will provide policy enforcement and broadcast containment for one or more access networks. Data meeting policy criteria is permitted to pass into the distribution layer from which unrestricted access to core routers will be possible. The L2 devices will be utilized to provide economical and scalable connectivity between adjacent L3 devices. A distribution network (AKA distribution tree) is that section of the distribution layer that is entirely contained beneath a specific core router (or matched set of core routers).

The core layer will be composed of L3 devices (C) intended to forward packets only. It is a primary goal of RUNet to construct a policy free core that passes data without imposing any restriction. The segregation of policy in a consistent fashion to the distribution layer permits the core to be policy free. A core network is that section of the core layer that appears as either a single core router or multiple core routers that have core layer compliant connections (to be defined later).

The above topological relationships between nodes are consistent with all requirements established in the RUNet Design Model.

Wide Area Network

A remote connection is a reduced bandwidth connection that is the result of some combination of distance, infrastructure ownership, or economic consideration. A remote site is one network location that is separated from another by a remote connection. A remote site can consist of a single host or an entire core network. There can be no individually remote sites as remote implies, by definition, a peer relationship. Thus, Camden is a remote site from the perspective of New Brunswick and vice versa. The Rutgers Wide Area Network (WAN) will consist of all remote connections connecting all of Rutgers' remote sites.

Common sense dictates that proximity will have a profound impact on connectivity. The subnets within a building are largely intended to be collected behind a single distribution router and thus form a distinct access network. A tight collection of buildings will generally be collected together under a single core router (or matched set of core routers) and hence be wholly contained within a distinct distribution network. This logic extends upwards to allow for geographically separated campuses (i.e., Camden, New Brunswick, and Newark) to be logically associated via remote connections (part of the WAN) without implying different treatment within the campuses themselves.

Remote sites of sufficient size will be otherwise RUNet compliant. They will be constructed utilizing identical technology at the access, distribution, and core layers. However, these core networks will be connected to the New Brunswick core network via remote links that represent best effort bandwidth.

Distribution scaling

The design model stipulates exactly two L3 hops between a leaf device and the core infrastructure. These two devices are the local distribution router and a core access router. This is consistent with the design model restriction that policy enforcement takes place on distribution routers only and that core routers are unrestricted. Thus, two L3 devices represent the minimum number of devices required to allow for both an unrestricted core and the existence of traffic policies.

CDscaling.gif

Figure 3: Topological map of the design network representing core (C), and distribution (D). In order to facilitate scaling, the core and distribution devices are separated by a switch layer.

The number of "A" buildings in the fiber plan is small and the total number of buildings at Rutgers University is large. Thus, it is reasonable to assume that the L2 connection between core and distribution would not be point to point. Instead, a layer of switching between core and distribution routers would be required to permit adequate scaling and meet projected building counts. If the ratio between layers was approximately 8, the tree (see figure 3) would contain 64 distribution routers supporting a minimum of 64 buildings.

Bandwidth limits

While it is true that there exists a broad range of bandwidth possibilities for RUNet, the desire to provision for traffic aggregation and still provide for an improved network at all levels places pragmatic restrictions on bandwidth choices. The following are min/max guidelines for network bandwidth between the four respective network layers: core (C), distribution (D), access (A), and leaf (L).

connection   minimum (Mb/s)   maximum (Mb/s)
C-C          2x622            4x2488
D-C          622/100          2000/1000
A-D          100              1000
L-A          10               100

Table 2: bandwidth limits between node types based on L2/L3 designations.

The above ranges, consistent with the design model, were generated in accordance with the following constraints:

  • OC192 is not practical at this time as it would be difficult to incorporate from an interface perspective and it is not clear that it will be standardized to run over regular SM glass.
  • OC48 is currently the fastest practical bandwidth that can be natively driven over SM glass.
  • Fiber conservation encourages usage limits to be ratified and it is proposed that no more than 4 pair of SM fiber be allocated for direct A-A core connectivity.
  • While multiplexing is possible, a worst case evaluation discounts its deployment and requires native fiber counts to be utilized instead. Multiplexing can be utilized to expand "apparent" fiber counts in the future or can be reevaluated directly if fiber constraints prevent consideration of topologies or hardware that would otherwise be attractive.
  • Dual values indicated between core and distribution reflect the possibility of L2 switching to expand L3 neighbor relationships without increasing router ports directly.

Bandwidth estimates reflect a goal to construct network designs that deliver an order of magnitude performance improvement. Most of the legacy networks at RU are constructed utilizing 10M shared media connected to 10M half duplex router ports. The building routers themselves have 10M half duplex connections to the RU backbone. The access layer is L2 by the design model and will be constructed utilizing 10/100 switching. An upgrade from 10M shared to 10M switched with 100M full duplex to the distribution router is an order of magnitude for all networks that contain 5 or more hosts. The minimum proposed distribution uplink is listed at 100M full duplex which is an order of magnitude improvement over the current 10M standard link.

Media types

The physical layer (L1) media choices are limited to single mode fiber, multimode fiber, and twisted pair. Their respective use will be constrained as follows:

connection   media choice
C-C          fiber (SM)
D-C          fiber (SM)
A-D          fiber (MM/SM internal/external)
L-A          copper
gigabit      fiber (MM/SM internal/external)

Table 3: media choice for connection between network nodes.

The guidelines for glass will permit the use of single mode fiber for A-D when the outside physical plant is utilized. This will allow for the respective distribution router to be located in a parent building without hitting distance limitations. Further, all gigabit Ethernet will utilize glass.

Topology

One goal of the implementation model is to treat all client buildings similarly and not have network access be significantly impacted by the vagaries of local geography. This requires a consistent and modular topology that still permits expansion and that can be adapted when special requirements are identified. In addition, a shallow L3 depth to network trees will reduce data latency. Thus, all hosts will traverse a comparable path to the network core.

The above can only be accomplished if all buildings are addressed in a topologically similar fashion. Regardless of physical building designation (A, B, C), traffic must traverse comparable paths to the core. Since the design model did not prohibit multiply connected devices, traffic would certainly be free to traverse shorter paths when the destination is topologically close.

However, the fiber plant is essentially constructed utilizing 3 distinct layers (A, B, C). Thus, the hardware indicated in figure 3 can be designated by letter to show building location (see figure 4).

CDbuildings.gif

Figure 4: Scaled distribution tree with building designations.

This particular arrangement was chosen to facilitate balanced subtrees. The network topology should ideally be constructed utilizing network sections that display self similar behavior. This is consistent with the assumption that all buildings will contain leaf nodes and require network connectivity.

If self similar sub trees are mapped, preserving all edges and vertices, onto the actual building plant, the original tree topology could be redrawn as shown in figure 5.

CD_minimum_topology.gif

Figure 5: Distribution topology.

Note that figure 5 implies the use of additional fiber between the A and B buildings. In general, if the network is constructed utilizing self similar tree components and one goal is general balance between distinct tree sections, each layer of the tree would require one more fiber pair than the layer beneath. If the fiber plant was constructed utilizing 5 physical layers (A ... E), the E-D connectivity would require a single pair while the B-A would require 4 pair.
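The fiber pair rule above can be checked with a few lines of code. This is a minimal sketch, assuming one pair on the lowest link and one more pair on each link above it; the function name and the string encoding of layers are hypothetical.

```python
def fiber_pairs_per_link(layers):
    """Map each child-parent link to the number of fiber pairs it needs.

    `layers` is ordered from the root down, e.g. "ABCDE". The lowest link
    needs one pair and each link above it needs one more.
    """
    depth = len(layers)
    links = {}
    for i in range(depth - 1):
        parent, child = layers[i], layers[i + 1]
        links[f"{child}-{parent}"] = depth - 1 - i
    return links


# Five layer example from the text: E-D needs 1 pair, B-A needs 4.
print(fiber_pairs_per_link("ABCDE"))
# Three layer plant as proposed for RUNet: C-B needs 1 pair, B-A needs 2.
print(fiber_pairs_per_link("ABC"))
```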

The simple tree (figure 5) contains 7 distribution routers representing a worst case scenario of 7 buildings. This tree contains 1A, 2B, and 4C buildings. In this example, the level ratio is 2. If the geometric progression is assumed to be constant (ie: N^0 core boxes, N^1 distribution switches, and N^2 terminal buildings) then this topology satisfies the following relation:

D = 1 + N*(1 + N)

If the scale factor is 8, a simple tree represents 73 buildings. There are no hard requirements that the level ratio remain constant. If the respective values are n distribution switches per core router and m distribution routers per distribution switch, the number of distribution routers represented is as follows:

D = 1 + n(1 + m)
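As a quick check of the relation above, the following sketch evaluates it for the two cases mentioned in the text. The function names are hypothetical; the per-designation breakdown assumes one distribution router in the A building, one in each B building, and one in each C building, as in the simple tree of figure 5.

```python
def distribution_routers(n, m):
    """Evaluate D = 1 + n(1 + m).

    n: distribution switches per core router
    m: distribution routers per distribution switch
    """
    return 1 + n * (1 + m)


def buildings_by_designation(n, m):
    """Break D down into A, B, and C building counts."""
    return {"A": 1, "B": n, "C": n * m}


print(distribution_routers(2, 2))      # 7, the simple tree of figure 5
print(distribution_routers(8, 8))      # 73, scale factor of 8
print(buildings_by_designation(2, 2))  # {'A': 1, 'B': 2, 'C': 4}
```

With n = m = N the general relation reduces to D = 1 + N*(1 + N), the constant ratio case.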

Figure 6 explicitly adds building boundaries to the network topology indicated in figure 5. As indicated previously, the cost of balanced subtrees is one additional pair of fiber for each tree layer.

CD_minimum_buildings.gif

Figure 6: Distribution topology with building boundaries.

The ability to add redundancy appears because the distribution switch contained in a B building is directly accessible by the distribution router for the building itself (see figure 7).

CD_minimum_redundancy.gif

Figure 7: Distribution topology with redundant links.

Figure 7 represents genuine redundancy that is obtained at very little cost. In addition, the equal cost path with immediate reconnection is very amenable to OSPF. However, shallow trees may not scale appropriately for a network that is the size of Rutgers. It may not be possible to limit typical trees to 73 buildings. The topological solution is horizontal replication of the basic tree structure (see figure 8).

CD_horizontal_scaling.gif

Figure 8: Distribution topology with horizontal replication of primary tree.

Figure 8 displays horizontal replication (widening the tree) which retains the basic three layer structure consistent with the proposed fiber plant. It does not add significantly to the overall complexity, but does increase overall capacity to approximately 145 buildings. Redundancy is also enhanced by doubling the number of core access routers to at least two per OSPF area. The connecting link between the core routers would permit intra-area traffic to traverse. To provide better balance between the two halves of the area, the link between the area border routers may be trunked. Horizontal replication of the primary topology is appropriate because it scales linearly while vertical replication (increasing tree depth) scales geometrically and quickly creates trees that are too large.
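The contrast between horizontal and vertical replication can be made concrete with a small calculation. This is a minimal sketch, assuming a constant level ratio and one building per tree node; the function names are hypothetical.

```python
def tree_size(ratio, depth):
    """Buildings in a tree with levels 0..depth and a constant level ratio."""
    return sum(ratio ** level for level in range(depth + 1))


def replicate_horizontally(ratio, copies):
    """Capacity of `copies` side by side three layer (A, B, C) trees."""
    return copies * tree_size(ratio, 2)


# One three layer tree at ratio 8.
print(tree_size(8, 2))               # 73
# Horizontal replication: capacity grows linearly with the copy count.
print(replicate_horizontally(8, 2))  # 146
# Vertical replication: one extra layer multiplies the tree size.
print(tree_size(8, 3))               # 585
```

Two copies of the 73 building tree give 146 by this count, in line with the approximate figure quoted above, while a single additional layer of depth already yields 585.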

CD_horizontal_buildings.gif

Figure 9: Distribution topology with building boundaries.

Figure 9 shows a complete tree that has been replicated horizontally and also indicates building boundaries. This representation is not a requirement and it should be noted that building boundaries are physical while network topology is fundamentally logical. There is little to be gained by stipulating network topology that is inconsistent with physical reality, but there is also little to be gained by forcing a direct correspondence between them. The core routers will require intercampus connectivity and it is reasonable to locate them in buildings that contain intercampus fiber. Thus, common sense dictates that the core routers will be located in A buildings.

The need to provide distribution routers to terminate all access networks does not require that all buildings have a router. The example network (see figure 10) represents a small cluster of buildings that are connected together in an RUNet compliant topology.

CD_example_1.gif

Figure 10: Sample distribution topology designed to service a small collection of buildings. The connections and bandwidths indicated are all RUNet compliant.

Figure 10 contains one A building, two B buildings, and three C buildings. The redundancy link between the distribution router and a collocated distribution switch is indicated in the B building. It can also be seen that one building does not contain its own distribution router. In addition, an exception case is portrayed in that a terminal fiber plant building (C) is directly connected to an A building. Regardless of these physical complexities, the generic logical topology maps directly onto the physical infrastructure. All buildings contain leaf nodes as indicated by the presence of an access network at each location and each access switch is appropriately connected to a single distribution router. The standard proposed topology is sufficiently flexible that all buildings can be accommodated in a consistent and predictable manner.

Legacy Network

The legacy network will be connected to the RUNet network through a single L2 core connection and will be a direct L3 neighbor to at least one core router. The core will not support policy in any form, thus all traffic that comes from the legacy network will pass through a transition router. The transition device is not an RUNet device, but will provide policy enforcement and traffic evaluation such that all traffic that comes from the legacy network will be sanitized before entering the RUNet core.

External

Internal, from a network perspective, is defined as utilizing Rutgers address space, abiding by Rutgers policies, and controlled by Rutgers traffic rules. All address ranges that are formally owned by Rutgers University are internal. This includes 128.6.0.0/16, 165.230.0.0/16, several /24 address ranges, and all address ranges that are deemed private by the Internet. For example, 10.0.0.0/8 is a private address range, and as such is intrinsically owned by everyone. Private addresses cannot be passed outside of the AS, and thus can be utilized internally. In addition, internal is described as Rutgers' policy space, which includes the ability to set and enforce protocols as well as perform traffic management. Thus, external is the formal complement of internal (that which is not internal).
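The address space part of this definition can be sketched with Python's standard ipaddress module. This is an illustration only: it assumes that "private" means the three RFC 1918 ranges, it omits the /24 ranges because they are not enumerated here, and it does not model the policy aspects of the definition. The constant and function names are hypothetical.

```python
import ipaddress

# The two /16 ranges named in the text. The additional /24 ranges are
# not listed in this document, so they are deliberately left out.
RUTGERS_NETWORKS = [
    ipaddress.ip_network("128.6.0.0/16"),
    ipaddress.ip_network("165.230.0.0/16"),
]

# Assumption: "private" is taken to mean the RFC 1918 ranges.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address):
    """Return True if an IPv4 address falls in internal address space."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in RUTGERS_NETWORKS + PRIVATE_NETWORKS)


def is_external(address):
    """External is the complement of internal."""
    return not is_internal(address)
```

A check of this kind covers only address ownership; whether traffic is actually treated as internal also depends on the policies and traffic rules applied to it.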

Internal and external are separated by an external handoff group (AKA E, see figure 11). An external handoff group is not a distinct hardware component, but rather a coordinated set of hardware and features corresponding to a defined set of policies and restrictions that separates internal space from external space. These groups must contain an L3 device, referred to as an external handoff router, and may also contain a distinct firewall in addition to public or limited access networks. External groups are permitted to be singly or multiply connected to either core routers that are the root of distribution networks or to dedicated core devices.

E_group.gif

Figure 11: External handoff group E.

The E function is a transition/barrier between internal and external space. The public network is a designation for a network visible to external space that has no restrictions excluding those that are applied in general at L3 on the external handoff router. The DMZ is restricted space that can be utilized to deliver external services or to support proxy servers. The topological representation above is intended to be logical. To be more specific, a public network would be connected directly to the external handoff router while the DMZ would be connected directly to the firewall. Where the firewall function is performed on the external handoff router, both the DMZ and public networks would be directly connected to it.

Direct external to external traffic is defined to be traffic with both the source and destination outside of the internal space. This mode is neither directly supported nor precluded, but rather will be evaluated on a link by link basis. By way of example, we would not want this traffic pattern to take place between our two primary Internet feeds, but we may allow UMDNJ to send traffic out to the Internet through Rutgers.

While it might be attractive to collect all external routing to a single handoff router, this may not be practical as a primary implementation requirement. Rutgers will undoubtedly require multiple external routes, some of which will be topologically separate. Multiple E's are permitted and they are not required to be L3 neighbors.

External handoff routers will be permitted, but not required, to run BGP. Direct communication between external handoff groupings is neither required nor prohibited. It is certainly necessary to provide a BGP mesh for unambiguous routes to primary Internet handoffs, but a secondary external route would hardly be ambiguous without direct E-E communication. The real question regarding E-E communication centers around the need to have external routing information exist unambiguously within the AS. A specific example would be duplicate BGP routes (with differing costs) that may propagate from the current BBN and UUNet links.

The E group is ultimately permitted direct contact with the core. This is more critical for routers within topologically separated E groupings that contain BGP speaking routers which require a communication mechanism. Allowing the E group to be directly connected to the core will allow for greater flexibility in placement while still allowing BGP meshes. The E groupings are not prohibited or required to connect to core devices which are in direct support of the distribution layer.

  • Firewall functionality can exist in the external handoff router.
  • Firewall functionality can exist in some other device at either L2 or L3.
  • An unrestricted public services network is permitted.
  • A partially restricted DMZ network is permitted.
  • External to external traffic is permitted or prohibited on a link by link basis.
  • The group may be multiply connected to the core.
  • The core router connection may be through dedicated devices or via core routers serving distribution networks.

The primary Internet handoff routers could thus be contained within an E group that contains all of the listed features. Those external groups of lesser importance will incorporate less; the relatively broad definition for the E group will allow for both.

While complete topology has not been dictated, Figure 12 is offered as indicative of the requirements and prohibitions that have been delineated.

E_example.gif

Figure 12: Example of RUNet topology that contains an external handoff group, wide area links to two important remote sites, and a connection point for the legacy network.

Figure 12 portrays network topology in the vicinity of a primary Internet handoff. The indicated E group supports all specified functions and is clearly an L3 neighbor to two core routers. In addition, these core routers support wide area links to the Newark and Camden campuses respectively. This allows for the two remote campuses to enjoy excellent access to the primary Internet handoff, comparable to the other campuses. The legacy network is similarly positioned to facilitate excellent external connectivity.

Conclusion

The RUNet implementation model is a work in progress that will be utilized to deploy the final network. It contains all of the requirements and restrictions to effectively reduce the overall deployment state space to a minimal set of choices. The implementation model is entirely consistent with supporting models and documentation and is expected to achieve 85% direct applicability.