The Infrastructure Upgrade: The Story Continues

by Annie Yu

In the previous issue of Network Computing, we touched on a few aspects of the large-scale IT Infrastructure Upgrade plan, namely the actions to be taken to remedy the weak points of the campus network, the addition of resources to some of the critical central servers, and the largest-ever PC upgrade exercise, which was carried out only recently. To complete the picture of the plan, we shall examine in turn each of the other areas that also require immediate attention. They are:

1) Upgrade of the Network Infrastructure and Bandwidth

Over the last 15 years, CTNET, CityU’s campus network, has undergone several phases of development to cope with increasing demands and to adopt new technologies. Currently, the network consists of two parallel backbones, one carrying ordinary data and the other video. With this model, services such as CityVoD, video conferencing, video and TV broadcast, the digital image library and other multimedia applications can run at optimal performance without affecting the traffic of services provided on the data network.

At the moment, the campus network is still sufficient to handle the daily traffic, but what about in a few years’ time? We can foresee that far more bandwidth will be required, driven by a growing number of network nodes, the increasing processing power of PCs and workstations, the rapid proliferation of multimedia applications and so forth. A major upgrade is therefore considered necessary, and its benefits are manifold, as described below:

a. The existing network has a number of problems and limitations. To list a few:

  • The two backbones each play a specific role; mutual backup is therefore extremely difficult if one of them fails.
     
  • Local Area Network Emulation (LANE) services impose a significant overhead on the ATM core, which renders the ATM backbone unable to support high volumes of broadcast and multicast traffic.
     
  • One-armed routers are used to forward inter-VLAN (Virtual Local Area Network) data traffic on the ATM network. These routers are relatively slow compared with layer-3 switches and have therefore become a bottleneck for delay-sensitive and bandwidth-hungry applications.
     
  • Quality of Service (QoS) is not supported by the existing ATM network, which is based on LANE 1.0. It must be upgraded to MPOA, which includes LANE 2.0, in order to provide QoS.
     
  • The backplanes of the ATM switches are limited to 3.2 Gbps, which means that a future upgrade of the ATM network to support gigabit networking will be a concern.
     
  • An Ethernet or Fast Ethernet port on an ATM switch is, on average, more expensive than a 10/100 Ethernet port on a layer-2 or layer-3 Ethernet switch, and its annual hardware maintenance cost is likewise higher.
     
  • ATM technology has been superseded by Gigabit/Fast Ethernet in the majority of new campus network installations and upgrades. Putting further investment into a technology near the end of its life cycle seems inappropriate.
     
  • The existing multi-mode fibers cannot support gigabit speeds over distances of more than 300 metres.
     
  • Network monitoring tools are inadequate for the collection, comparison, and analysis of network performance and operational metrics. Shortage of manpower for network monitoring is yet another issue.

It is envisaged that most of the aforementioned problems can be resolved by the upgrade.

b. The upgrade will keep the University competitive with other local universities, most of which are in the process of upgrading their campus networks. (Specifically, HKUST will provide 100 Mbps for each network connection and 1000 Mbps for each network uplink. This may sound like overkill, but the provision of network bandwidth is one of the factors considered when comparisons are drawn between the universities.)

c. The two existing backbones will be merged into one based on Gigabit/Fast Ethernet and layer-3 switching technologies, resulting in a simplified network architecture that will ease network management and support efforts as well as reduce the total cost of ownership (TCO).

d. QoS will be supported, so delay-sensitive applications such as video conferencing, VoD, TV broadcasts and Voice-over-IP can be better served by the network.

e. Gigabit Ethernet uplinks will be provided between the wiring closets and the core switches. The fiber cable plant will be upgraded to include single-mode and the new 50 µm multi-mode fibers. The new cable plant will be able to support gigabit networking as well as the forthcoming 10 Gbps Ethernet standard.

f. Network monitoring tools will be deployed to collect real-time and historical metrics, which can be used to establish baseline measures and watch for trends.

g. Additional VLANs can be supported. Each department may have its own VLAN(s) if necessary.
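
As a rough illustration of how per-department VLANs (item g) might be planned, the short Python sketch below carves a single campus address block into one subnet per departmental VLAN. The address block, VLAN numbers and department names are purely hypothetical placeholders, not the actual CTNET addressing plan.

    import ipaddress

    # Hypothetical campus block and departments -- illustrative only,
    # not the actual CTNET addressing scheme.
    CAMPUS_BLOCK = ipaddress.ip_network("10.10.0.0/16")
    DEPARTMENTS = ["Computing Services", "Library", "Biology", "Economics"]

    # Carve the block into /24 subnets and assign one to each departmental VLAN.
    subnets = CAMPUS_BLOCK.subnets(new_prefix=24)
    vlan_plan = {}
    for vlan_id, (dept, subnet) in enumerate(zip(DEPARTMENTS, subnets), start=100):
        vlan_plan[dept] = {"vlan": vlan_id, "subnet": subnet}

    for dept, entry in vlan_plan.items():
        print(f"VLAN {entry['vlan']:>4}  {entry['subnet']}  {dept}")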


2) Consolidation of Departmental Local Area Network (LAN) Servers

At the moment, the 55 staff departmental LANs consist of 60 NT servers and around 3,000 Windows 98 clients. Among the 60 NT servers, five have been configured as a single master domain, with one acting as the Primary Domain Controller and four as Backup Domain Controllers. The master domain servers handle user account maintenance, logon validation, WINS, central software backup services for Windows 95 and 98 and the central Intranet menu, and also act as backup resource domain servers for departments. One disadvantage of this setup is that disk space on the NT servers is more difficult to expand and less reliable, so users tend to store their data on other central servers such as the email servers. Furthermore, disk sizes and the space occupied by data and programs are growing so fast that the existing tape drives will soon be unable to complete backup and restore within a reasonable time.

The remaining 55 departmental NT servers, which serve around 60 departments, have been configured as resource domains supporting departmental software installation, file and print services, the departmental Intranet menu and user-specific file sharing.

Depending on the licenses, usage and sources of funding, some client software with network licenses is installed on the departmental servers, while software with individual licenses is installed on clients’ PCs. The network version of a package owned by one department is installed only on that department’s server. Thus, some popular software inevitably has to be duplicated on each of the departmental servers, thereby increasing the support effort.

In order to eliminate the shortcomings of the existing setup and enhance our LAN services, it is proposed that:

  • The master domain servers are to be replaced with servers running clustering and NSS technologies. With these technologies, the chances of server outages or disk failures will be significantly reduced, and the expansion or dynamic allocation of disk space will be much easier and more flexible. Furthermore, additional critical functions of the departmental servers can be replicated on these servers for fail-over purposes.
     
  • All popular software is to be relocated from the departmental servers to the clustered servers, so that only a single copy of each package needs to be maintained, thereby reducing the support effort.


3) Establishment of a Central Network Storage System (NSS)

Today, servers across the LANs and Wide Area Networks (WANs) need to provide fast, reliable, ubiquitous access to data, as required by various applications. Moreover, our mission-critical applications in particular require data storage that is operational all year round and can be expanded easily without interrupting any running application. At present, all our data is held on direct-attached storage. Centralised storage systems such as network-attached storage (NAS), Storage Area Networks (SAN) and Enterprise Storage Networks (ESN) are now available on the market. Adopting a Central Network Storage System (NSS) is clearly the direction to go, and its benefits are quite apparent:

a) High Availability and Disaster Recovery

An NSS has the ability to make any-to-any connections among multiple servers and storage devices. It can create a shared pool of storage that can be accessed by multiple servers through multiple paths, resulting in high availability. NSS therefore provides an excellent environment for clustering that can extend to dozens of servers and storage devices. If a storage device fails, the servers continue to function because another copy of the data is still available through the NSS.

Through NSS, many vendors now offer 100% data availability; this makes long backup times less critical and disaster recovery less of an issue.
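
As a minimal sketch of the "another copy is still available" idea at the application level, the Python fragment below reads a file from a primary storage path and falls back to a mirrored copy if the primary is unavailable. In a real NSS/SAN, this fail-over is handled transparently by the storage and volume-management layers; the mount points shown here are hypothetical.

    from pathlib import Path

    # Hypothetical mount points for two copies of the same data,
    # e.g. reached over two independent storage paths.
    PRIMARY = Path("/storage/primary")
    MIRROR = Path("/storage/mirror")

    def read_file(relative_name: str) -> bytes:
        """Read a file, falling back to the mirrored copy if the primary fails."""
        last_error = None
        for root in (PRIMARY, MIRROR):
            try:
                return (root / relative_name).read_bytes()
            except OSError as err:    # path down, device failed, file missing, ...
                last_error = err
        raise OSError(f"no available copy of {relative_name}") from last_error

    # Example use: data = read_file("email/mailbox.db")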

b) Backup

NSS simplifies backups by sharing tape drives among servers. It also allows a single backup to be handled by multiple tape drives simultaneously, thereby reducing the backup time.
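
To give a feel for how splitting one backup job across several drives shortens the elapsed time, here is a small Python sketch that distributes a set of directories over worker threads, one per tape device, and writes the streams in parallel. The device names, directory list and the plain tar command are placeholders for whatever backup software and hardware are actually deployed.

    import subprocess
    import threading

    # Hypothetical tape devices and directories to back up -- placeholders only.
    TAPE_DRIVES = ["/dev/nst0", "/dev/nst1", "/dev/nst2"]
    DIRECTORIES = ["/data/email", "/data/web", "/data/staff", "/data/library"]

    def backup_to_drive(drive, dirs):
        """Write the assigned directories to one tape drive, one tar archive each."""
        for d in dirs:
            subprocess.run(["tar", "-cf", drive, d], check=True)

    # Round-robin the directories over the drives, then run the streams in parallel.
    assignments = {drive: DIRECTORIES[i::len(TAPE_DRIVES)]
                   for i, drive in enumerate(TAPE_DRIVES)}
    threads = [threading.Thread(target=backup_to_drive, args=(drive, dirs))
               for drive, dirs in assignments.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()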

c) Data Sharing

NSS allows multiple distributed servers to access a centralised store concurrently for data-sharing applications. With NSS, it is easy to deploy additional idle servers to handle mission-critical applications with short peak usage periods (e.g. student registration in the summer) and to return the number of servers to normal once the peak periods are over.

d) Management and Future Expansion

Storage devices can be distributed throughout the network yet managed from a central point through a single management tool. This will lower the cost of storage management, standardise administrative control and hence provide better reliability and availability. New devices can also be added online without disrupting data access.

It is planned to implement the NSS first on all mission-critical central hosts, such as the web servers, email servers and servers of the departmental staff LANs, and later to extend it to cover most of the central storage requirements.