CDOT Wiki β

Enterprise Hyperscale Lab

Revision as of 10:47, 13 May 2015 by Hong Zhan Huang (talk | contribs) (Equipment Detail)
The EHL in January 2015.

The Enterprise Hyperscale Lab is operated by the OSTEP team to perform research on open source technologies and emerging hyperscale SoC-based systems.

Equipment Overview

The EHL consists of a dual-thermal-zone (cold/hot) rackmount cabinet with power conditioning and backup, power distribution, thermal monitoring, and 1- and 10-gigabit network services. This cabinet supports a large number of hyperscale and SoC-based ARM computers for various applied research projects.

A second equipment cabinet very similar to the first is on order and will be installed in Summer 2015.

Equipment Detail

Cabinet

Each EHL cabinet is a Silentium AcoustiRack Active, a full-height acoustically-insulated rackmount cabinet with two fan units. The lower fan unit takes air from outside the cabinet and blows it up the front of the rackmount equipment area (cold zone). Air passes through the individual devices and is vented out the back of each unit into the hot zone. A second fan unit exhausts air from the hot zone out the top of the cabinet.

Each of the fan units includes active noise cancellation so that the loaded rack can be operated in a software development lab context.

After the fan units are installed, there are 33U available for other devices.

Cabinet 2

As of May 2015, the parts that will make up the second EHL cabinet have been temporarily assembled in storage for use with the AMD Seattle.

Power

Power is provided by two 115 volt, 15 amp, 60 Hz circuits that feed two independent APC 1.5 kW rackmount uninterruptible power supplies (UPSes). Each UPS has a network interface for remote monitoring and control.

The UPSes feed two Raritan Dominion power distribution units (PDUs), mounted vertically up the sides of the cabinet at the back. The PDUs act like long power bars, but have control and monitoring systems so that per-outlet current consumption can be measured and remotely monitored over the network using the SNMP or HTTP protocols. Each outlet can also be switched on or off under network control.
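Per-outlet readings retrieved over SNMP arrive as plain text from tools such as net-snmp's `snmpget`. A minimal parsing sketch follows; the OID, the sample line, and the assumption that the PDU reports current in milliamps are all illustrative, not taken from Raritan's actual MIB:

```python
# Sketch: parse a per-outlet current reading as printed by net-snmp's
# "snmpget" against a PDU. The OID and value below are made-up examples.

def parse_outlet_current(snmp_line: str) -> float:
    """Extract a current reading (in amps) from one snmpget output line.

    net-snmp prints lines of the form:
        <OID> = Gauge32: <value>
    We assume here the PDU reports milliamps, so we convert to amps.
    """
    value = int(snmp_line.rsplit(":", 1)[1].strip())
    return value / 1000.0  # milliamps -> amps (assumed unit)

sample = "SNMPv2-SMI::enterprises.13742.4.1.2.2.1.4.3 = Gauge32: 750"
print(parse_outlet_current(sample))  # 0.75 A on this hypothetical outlet
```

Summing such readings across outlets is one way to cross-check the cabinet-level draw reported by the UPSes.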

Where possible, devices have been configured with dual power supplies. This means that many of the devices in the EHL are plugged into both PDUs and can continue to operate when power is cut to one of the two supplies. This permits the EHL to be rewired without any downtime; it also guards against downtime due to PDU, UPS, or PSU failure.

Devices which are not configured with dual PSUs are connected to just one of the PDUs.

The total draw of the EHL equipment installed in the first cabinet as of January 2015 was approximately 1.8 kW under load. The current power system can supply a little over 3 kW; the cabinet supports thermal exchange of about 8 kW.
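The power budget above can be sanity-checked with a little arithmetic; the figures come from this section (the ~3 kW ceiling is simply the two 1.5 kW UPSes combined):

```python
# Rough power-budget check for cabinet 1, using figures from the text.
circuits = 2            # two 115 V / 15 A feeds, one per UPS
ups_capacity_w = 1500   # each APC UPS is rated 1.5 kW
measured_load_w = 1800  # approximate draw under load, January 2015

total_capacity_w = circuits * ups_capacity_w        # "a little over 3 kW"
headroom_w = total_capacity_w - measured_load_w

print(f"capacity: {total_capacity_w} W, load: {measured_load_w} W, "
      f"headroom: {headroom_w} W ({headroom_w / total_capacity_w:.0%})")
# capacity: 3000 W, load: 1800 W, headroom: 1200 W (40%)
```

Note that the 8 kW thermal-exchange capacity of the cabinet, not the UPSes, would be the eventual limit if the power system were upgraded.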

Cabinet 2

As in the first cabinet, power is provided by one APC 1.5 kW rackmount UPS, which feeds one Raritan Dominion PDU. This configuration closely mirrors cabinet 1, providing the same remote control and monitoring of each networked component. The cabinet 2 PDU is currently used only by the AMD Seattle.

Environmental Monitoring

One of the PDUs is equipped with a string of three environmental sensors. These are laid out diagonally across the EHL, so that the air intake temperature, mid-cabinet hot zone temperature, and air exit temperature are monitored, along with the humidity.

In January 2015, typical intake temperatures on the EHL were 25–27 °C and exhaust temperatures were 36–37 °C.
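The intake-to-exhaust temperature rise, together with the measured load, implies a rough airflow figure. This is only a back-of-the-envelope estimate using the steady-state heat-removal relation Q = P / (ρ · cp · ΔT), with assumed standard values for air density and specific heat:

```python
# Back-of-the-envelope airflow estimate from the temperature rise.
# Q = P / (rho * cp * dT): volume flow needed to carry away P watts.
power_w = 1800         # approximate cabinet load (from the Power section)
delta_t = 36.5 - 26.0  # typical exhaust minus intake temperature, deg C
rho = 1.2              # air density, kg/m^3 (assumed, near sea level)
cp = 1005.0            # specific heat of air, J/(kg*K) (assumed)

flow_m3_s = power_w / (rho * cp * delta_t)
flow_cfm = flow_m3_s * 2118.88  # 1 m^3/s is about 2118.88 CFM

print(f"~{flow_m3_s:.2f} m^3/s (~{flow_cfm:.0f} CFM) through the cold zone")
```

At roughly 0.14 m³/s the fan units have ample margin against the cabinet's stated ~8 kW thermal-exchange capacity.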

Networking

One Cisco 24-port gigabit switch and one Netgear 24-port 10-gigabit switch are installed in the back of each EHL cabinet. The 10G switch provides both 10GBASE-T and SFP+ connections. Where possible, SFP+/DA (Direct Attach) copper cables are used because they are simpler and less expensive than fibre optic cables, yet offer much lower latency than 10GBASE-T connections (roughly hundreds of nanoseconds rather than a few microseconds); other connections are made with fibre optic transceivers or 10GBASE-T copper where required. Devices which do not support a 10-gigabit connection are connected with 1-gigabit or 100 Mbit/s Ethernet.

The connection between the EHL cabinets is made by a fibre optic 10 gigabit connection.

Connections between the EHL LAN and the outside world are provided by a Utilite, a small ARM computer installed in cabinet 1. This computer acts as a dual-homed host that provides firewall, NAT, forwarding, DNS, and VPN endpoint services.
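The gateway's NAT, forwarding, and firewall roles can be sketched with ordinary Linux tooling. This is a generic configuration sketch, not the Utilite's actual setup; the interface names (`eth0` outward-facing, `eth1` toward the EHL LAN) are assumptions:

```shell
# Generic Linux NAT-gateway sketch (NOT the Utilite's real configuration).
# eth0 = outside interface, eth1 = EHL LAN interface (assumed names).

# Allow the kernel to forward packets between interfaces.
sysctl -w net.ipv4.ip_forward=1

# Masquerade LAN traffic leaving via the outside interface (NAT).
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Let replies to established connections back in; allow LAN-to-outside.
iptables -A FORWARD -i eth0 -o eth1 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
```

The DNS and VPN-endpoint services mentioned above would run as separate daemons on the same dual-homed host.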

Storage

Storage is provided by a Synology Rackstation in cabinet 1, which provides both storage area network (SAN, raw block devices over protocols such as iSCSI) and network-attached storage (NAS, filesystem-level shared storage over protocols such as NFS and SMB) services. It is populated with twelve 1 TB SSDs and equipped with dual power supplies and dual 10-gigabit ethernet.
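From a client's point of view, the NAS and SAN roles look quite different: one mounts a shared filesystem, the other attaches a raw block device. A hedged sketch of each (hostnames, export paths, and the iSCSI target name are placeholders, not the lab's actual values):

```shell
# NAS: mount an NFS export from the Rackstation (placeholder names).
mount -t nfs rackstation.example:/volume1/shared /mnt/shared

# SAN: discover and log in to an iSCSI target; the LUN then appears
# to this client as a local block device (e.g. /dev/sdX).
iscsiadm -m discovery -t sendtargets -p rackstation.example
iscsiadm -m node -T iqn.2000-01.com.example:target0 -p rackstation.example --login
```

The practical difference: NFS/SMB shares can be used by many clients at once, while an iSCSI LUN is normally formatted and owned by a single client.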

Terminal Server

Many of the computers installed in the EHL do not have video output (because they're not intended for desktop applications). Most of these have a serial port; in many cases, this is a virtual serial port which is accessed over the network using the IPMI SOL (Serial-over-LAN) protocol. Since this traffic is not routed beyond the lab network, the client must run somewhere on the LAN.

For systems that do not have a working IPMI implementation, each EHL cabinet is equipped with a Cyclades terminal server which provides remote access to 32 serial ports. A remote user can connect to a selected port to monitor and control the connected system.
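For the systems that do support IPMI, a console session is typically opened with `ipmitool` from a host on the EHL LAN; the BMC address and credentials below are placeholders, not real lab values:

```shell
# Open a Serial-over-LAN console to a node's BMC (placeholder host/creds).
ipmitool -I lanplus -H 10.0.0.42 -U admin -P secret sol activate

# If a stale session is holding the console, close it from another shell.
ipmitool -I lanplus -H 10.0.0.42 -U admin -P secret sol deactivate
```

The same `-I lanplus` interface can also query sensors and control power, which complements the PDU-level switching described in the Power section.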

Display

Cabinet 1 is equipped with a 15" 4:3 LCD monitor, bolted to a 4U blanking panel. This display is driven by a Raspberry Pi, and can be used to show educational information about the rack, current system status information, or diagnostic data.

Calxeda/Boston Viridis ARM System

32-bit ARM compute is provided by a Calxeda EnergyCore ECX-1000 system from Boston Limited. There are three installed "EnergyCards", each with four ECX-1000 nodes, which each have a quad-core ARM Cortex-A9 processor and a small (Cortex-M) ARM management processor. This system runs the Pidora build system (except for the Koji hub and web nodes).

64-Bit ARM Compute

The EHL has over 100 cores of ARM64 compute, provided by a number of computers from multiple vendors. These computers provide a build system and testing platforms for software optimization, and are used for applied research on ARM64 systems.

Funding

The EHL is generously funded by an NSERC Applied Research Tools and Instruments (ARTI) grant under the CCI program.

Additional systems installed within the EHL have been provided by OSTEP applied research partners.

Location

EHL is located in CDOT.

Student and Open Source Community Access to EHL Systems

Select systems within EHL are accessible to both open source community members and to students when they are not in use for other OSTEP research.

Pidora Koji System

The Koji buildsystem used for the Pidora project, koji.pidora.ca, uses Calxeda nodes in the EHL as build servers, and will in the future use a system within the EHL as a hub. This system is publicly accessible.

Student Access

Students in SPO600 and SBR600 are given remote access to some EHL ARM computers for specific projects and labs, when they are not being used for OSTEP applied research projects.