Saying HA! to Downtime

by Dave Michels

Why you might explore high-availability solutions from NEC for your UC environment.

As I mentioned last month on No Jitter, NEC combined its UC technologies with its broader data center portfolio as part of its “Smart Enterprise” initiative. From a solution perspective, this move enables NEC to deliver a more comprehensive UC solution. However, it also positions NEC as a potential complement to non-NEC UC solutions that require infrastructure for high-availability (HA) server and storage, networking, and video.

NEC offers two products for achieving HA: the FT series of fault-tolerant servers for machine-level or system redundancy, and the ExpressCluster software for protecting against process, site, and network failures. These products are for use in physical or virtualized Windows and Linux environments.

While the UC industry as a whole, including NEC, has shifted toward software-delivered solutions, reliable server infrastructure is still necessary. Most UC applications, for example, run on industry-standard servers (ISS). Industry-standard hardware has largely replaced more expensive, proprietary hardware, but it shifts the burden of designing, managing, and operating the infrastructure onto the customer.

Unfortunately, few systems are truly designed for data center HA, so most enterprises must design and configure additional components, such as a virtualized cluster or standby hardware, to address their HA requirements. NEC offers hardware- and software-based solutions for improved availability of UC or other mission-critical apps. Here’s a look:

FT Servers
NEC claims its FT servers deliver 99.999% availability by effectively combining two systems into one. To software, each appears as a single industry-standard server with current Intel Xeon processors, memory, and disks. Fault tolerance is handled in hardware by NEC’s GeminiEngine chipset.

This technology provides continuous “lockstep” synchronization of the two servers, so that no data is lost in the event of a hardware failure. GeminiEngine keeps the secondary server’s CPU, memory, and I/O completely synchronized with the primary server. Because the redundancy and failover are built directly into the hardware, operations are dramatically simpler and there is no need for specialized or redundant licensing: even though the app is running twice, the architecture requires only a single license, and the same is true of the OS.

The normal state for a fault-tolerant server is duplex mode, in which the hot standby machine runs in lockstep synchronization with the primary server. NEC includes its ExpressScope software to provide real-time monitoring. When the server fails over automatically, ExpressScope provides visibility and sends a notification via an SNMP trap.

While FT servers provide system redundancy, they do not monitor applications or network services. For this, NEC offers ExpressCluster software.

It is a common misperception that system-level redundancy is enough; however, there are conditions where the server is fine yet still unavailable. For example, network congestion or application failures could create what users will perceive as downtime.
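To make the distinction concrete, here is a minimal Python sketch of an application-level check, as opposed to a machine-level one. It is not how ExpressCluster works internally; the names `tcp_probe` and `FailureDetector` and the three-failure threshold are assumptions chosen purely for illustration. The idea is simply that a server can be powered on and healthy while the service port users depend on is not answering.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: the service is not answering.
        return False


class FailureDetector:
    """Treat a service as down after `threshold` consecutive failed probes.

    Hypothetical helper: requiring several failures in a row avoids
    reacting to a single dropped packet or brief congestion.
    """

    def __init__(self, probe: Callable[[], bool], threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.probe = probe
        self.threshold = threshold
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run one probe; return True if the service should be considered down."""
        if self.probe():
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

In practice a monitor would call `check()` on a timer, for example with `FailureDetector(lambda: tcp_probe("uc-app.example.com", 5061))` (a made-up host and port), and trigger a recovery action only when it returns True.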

ExpressCluster provides continuous monitoring of applications and networks, and maintains synchronous data mirroring between servers. ExpressCluster is for use with or without FT servers, and can enable server infrastructure geo-redundancy across separate data centers. It is designed to minimize problems by initiating recovery measures across LANs, WANs, and storage-area networks.
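What “synchronous” buys you is easiest to see next to the alternative. The toy Python sketch below is an in-memory model only, not ExpressCluster’s implementation; the class names, the dictionary-backed “disks,” and the sample keys are all hypothetical. It assumes the simplest possible definition: a synchronous write is acknowledged only after both copies hold the data, while an asynchronous write is acknowledged first and replicated later.

```python
from collections import deque


class SyncMirror:
    """Acknowledge a write only after both copies hold it."""

    def __init__(self) -> None:
        self.primary = {}
        self.standby = {}

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        self.standby[key] = value  # replicated before write() returns


class AsyncMirror:
    """Acknowledge immediately; replicate later from a queue."""

    def __init__(self) -> None:
        self.primary = {}
        self.standby = {}
        self.pending = deque()

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        self.pending.append((key, value))  # standby does not have it yet

    def replicate(self) -> None:
        """Drain the queue, bringing the standby up to date."""
        while self.pending:
            key, value = self.pending.popleft()
            self.standby[key] = value


def lost_on_failover(mirror) -> set:
    """Keys acknowledged on the primary but missing from the standby."""
    return set(mirror.primary) - set(mirror.standby)
```

If the primary fails right after a burst of writes, `lost_on_failover` is empty for the synchronous model, but for the asynchronous one it contains every write still sitting in the queue. The trade-off, not modeled here, is that synchronous writes wait on the link to the standby.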

Availability Mandate
While HA is often a stated IT goal, creating highly available infrastructure takes upfront planning. Common outages can cause significant downtime, and just a single hardware failure in a year could easily represent 40 hours of downtime. For example, a Saturday failure requiring new parts could last until Tuesday even with next-business-day service.
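The arithmetic behind these figures is worth a quick look. The short Python sketch below converts between annual downtime and availability percentages; the function names are my own, and it assumes a 365-day year measured around the clock.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def availability_from_downtime(downtime_hours: float) -> float:
    """Availability, as a percentage, for a given number of down hours per year."""
    return 100.0 * (1 - downtime_hours / HOURS_PER_YEAR)


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at a given availability percentage."""
    return (1 - availability_percent / 100.0) * HOURS_PER_YEAR * 60


if __name__ == "__main__":
    print(f"40 hours down: {availability_from_downtime(40):.2f}% available")
    print(f"99.999% allows: {downtime_budget_minutes(99.999):.2f} minutes per year")
```

By this measure, a single 40-hour outage leaves a system at roughly 99.54% availability for the year, while the 99.999% level allows only about 5.26 minutes of downtime over the same period.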

Designing for high availability is only partly about mitigating failures; it is more about expecting them. A failure condition does not have to equate to downtime. HA planning necessitates designs with redundancies for key business application servers as well as systems for power, cooling, networking, and other dependencies.

Few fault-tolerant industry-standard servers are on the market. NEC has been offering them for years, and other vendors OEM some of its models. NEC’s HA solutions can be attractive in terms of return on investment and total cost of ownership because they do not require specialized software, application licensing, or training. These options from NEC may be worth exploring regardless of the UC vendor selected.

Dave Michels is a contributing editor and analyst at TalkingPointz.
