Bolster your Exchange server's resilience
Message servers became more resilient last fall when Microsoft released two products: Microsoft Cluster Server (MSCS, formerly code-named Wolfpack) and Exchange 5.5, Enterprise Edition (Exchange 5.5/E). MSCS is the first clustering solution for Windows NT to fully support Exchange Server in production environments. And Exchange 5.5/E, released in November 1997, formally supports clusters. Many large installations have been eagerly awaiting Exchange 5.5/E because they want to build very large servers or consolidate several small servers into large clusters to reduce the load of systems administration. Exchange 5.5/E offers two technical advances, clustering and an unlimited information store, that are key to building very large servers.
Putting more than a thousand mailboxes on a nonclustered server (even a server that's protected by a UPS, a high-specification RAID-5 array, and other resiliency features) is an act of faith. A hardware or software problem can interrupt the email service, and people can't do their work. Exchange 5.5 addresses the problem by supporting an active/standby cluster configuration. (For descriptions of clustering terminology, see Joel Sloss, "Clustering Terms and Technologies," June 1997.) Exchange ordinarily runs on the active node in the cluster; if a problem occurs, MSCS automatically transfers work to the standby node, which becomes active and continues to process user requests. Let's see how MSCS and Exchange 5.5/E can work for you.
Helping Exchange Understand Clusters
Engineers who want to support Exchange in clusters have several challenges, in addition to the obvious requirement to provide redundancy through hardware. One challenge is how to handle the stores (Exchange's databases). Another challenge is how to associate user mailboxes with particular Exchange servers. Exchange configuration data relating to components such as bridgehead servers (servers that connect Exchange sites) for messaging and directory replication connectors can also be server-specific.
The Exchange information and directory stores use a complex transaction model; databases, transaction logs, and queues held in memory represent the full stores. Any failover must be able to seamlessly switch the stores back to the state they were in when a problem occurred. Exchange can handle this requirement because of its capability to roll outstanding transactions forward into the database from the transaction logs. The transaction logs also satisfy the MSCS requirement for data to be persistent. In other words, applications must always write data to a place where you can access it, even if the cluster fails over.
When any Exchange server suffers an unexpected failure (such as a power outage), Exchange automatically recovers transactions the next time the Exchange Information Store (IS) service starts. This process is a soft recovery. When the active server fails in a two-node cluster, Exchange performs a similar recovery: The newly activated server takes responsibility for committing any outstanding transactions to the database before letting users reconnect. Conceptually, therefore, the IS has always been reasonably well prepared for clustering.
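The roll-forward idea behind soft recovery can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Exchange's actual store engine: the `LogRecord` and `Store` types, the sequence numbers, and the `checkpoint` field are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    seq: int    # hypothetical: monotonically increasing sequence number
    key: str    # hypothetical: the item the transaction touched
    value: str  # hypothetical: the new content for that item


@dataclass
class Store:
    pages: dict = field(default_factory=dict)  # stands in for the database file
    checkpoint: int = 0                        # highest seq already in the database


def soft_recovery(store, log):
    """Roll forward every logged record newer than the checkpoint.

    Returns the number of records replayed. Records at or below the
    checkpoint are skipped because the database already contains them.
    """
    replayed = 0
    for record in sorted(log, key=lambda r: r.seq):
        if record.seq > store.checkpoint:
            store.pages[record.key] = record.value
            store.checkpoint = record.seq
            replayed += 1
    return replayed
```

Because both the database and the log live on disks that either node can reach, the newly activated node could run the same replay the failed node would have run after a restart. Running the function a second time replays nothing, which is the property that makes the recovery safe to repeat.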
Breaking the association between user mailboxes (and configuration data) and specific servers is trickier. Within sites, administrators can assign servers certain work. Some servers might handle only public folders, some deal with connections, and others might act as hosts for user mailboxes. Exchange knows the work that each server performs from information that the directory holds; the name of a physical server defines the namespace that services use to access data.
Clusters make systems more resilient by ensuring that each server in the cluster can perform the work of its peers, if necessary. You must, therefore, develop a method to allocate work to the cluster as a whole, rather than to an individual server, and then modify the software to permit individual cluster members to assume tasks as the cluster state changes. To allocate work to the cluster, Exchange uses a cluster alias because an alias lets you address the cluster as a named entity. In short, you alter the namespace represented by a physical server to accommodate the concept of a virtual server whose workload any server in the cluster can perform. In clustering terms, you define the virtual server as part of a cluster resource group.
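A minimal Python sketch of the alias idea follows. The `VirtualServer` class, its method names, and the node names in the usage note are hypothetical; the sketch shows only that clients address one stable name while the physical node behind it changes.

```python
class VirtualServer:
    """A cluster alias that resolves to whichever physical node is active."""

    def __init__(self, network_name, nodes):
        if len(nodes) != 2:
            raise ValueError("this sketch models a two-node cluster")
        self.network_name = network_name
        self.nodes = list(nodes)
        self.active = self.nodes[0]  # the first node starts out active

    def resolve(self, server_name):
        """Return the physical node currently serving the virtual name."""
        if server_name.lower() != self.network_name.lower():
            raise LookupError(f"unknown server name: {server_name}")
        return self.active

    def fail_over(self):
        """Make the standby node active and return its name."""
        standby = next(n for n in self.nodes if n != self.active)
        self.active = standby
        return standby
```

With `vs = VirtualServer("EXCLUSTER", ["NODE-A", "NODE-B"])`, a call to `vs.resolve("EXCLUSTER")` returns `"NODE-A"` before `vs.fail_over()` and `"NODE-B"` after it. The name that mailboxes and connectors refer to never changes.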
Another set of changes completes support for clustering in Exchange. Most of these changes are at the application level (the Exchange services such as the Message Transfer Agent (MTA), the Information Store, connectors, etc.) on top of underlying APIs that MSCS provides. For example, Exchange treats each service as a separate resource, so an administrator can fail one service such as the Internet Mail Server (IMS) without stopping the IS. However, you can't fail one service and restart it on the standby cluster server. All Exchange services must run on the active cluster server.
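The rule that services are separate resources but move only as a group can be modeled roughly as follows. This is an illustrative sketch, not the MSCS API: the `ExchangeGroup` class and its methods are hypothetical, and the model deliberately ignores dependencies between services.

```python
class ExchangeGroup:
    """Toy model of a resource group owned by one cluster node at a time."""

    def __init__(self, nodes, services):
        self.nodes = list(nodes)
        self.owner = self.nodes[0]  # the active node owns the whole group
        self.online = {name: True for name in services}

    def take_offline(self, service):
        """Stop one service; the others keep running."""
        self.online[service] = False

    def bring_online(self, service, node=None):
        """Start a service, but only on the node that owns the group."""
        target = node or self.owner
        if target != self.owner:
            raise ValueError("services can run only on the active node")
        self.online[service] = True

    def move_group(self):
        """Fail the entire group over to the other node."""
        self.owner = next(n for n in self.nodes if n != self.owner)
        return self.owner
```

Taking the IMS offline leaves the IS untouched, but an attempt to bring the IMS online on the standby node raises an error until the whole group has moved there.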
In summary, Exchange 5.5/E supports clustering with these new features:
- A Setup utility cluster prompt: If you have installed MSCS, Setup will prompt you to install the cluster-aware version of Exchange, as Screen 1 shows. You can't install the standalone version of Exchange on a server where MSCS is active.
- Support for the concept of virtual and physical namespaces: Installing Exchange Server onto a cluster creates a new resource group to hold details of Exchange's clusterwide resources. These resources include the disks used to store the application files and binaries and a network name. The network name functions as the name of the virtual server that runs Exchange in the cluster, and it appears as the server name when you view the cluster through the Exchange Administrator program. You also use the network name to specify the name of the cluster when you define a bridgehead server for a messaging or directory replication connector.
- Support for cluster state transitions: When the active server fails, the cluster goes through a transition. Responsibility for running applications passes to the passive standby member, which now becomes active. Exchange 5.5/E includes modifications to let its services gracefully fail over to the other server. Transitions can also occur when an administrator voluntarily stops an application to perform maintenance.
- Administrative support for virtual servers: Microsoft has changed the Exchange Administrator program to let server monitors work with virtual servers. However, you can stop and restart services only by using the MSCS administration program.
Software and Hardware Requirements
To build an Exchange cluster, you need NT Server 4.0, Enterprise Edition (NTS/E), Exchange 5.5/E, and cluster-compatible hardware. You must configure MSCS and get it running before you install Exchange. You need the enterprise edition of Exchange because the standard edition doesn't support clustering, and the enterprise edition lets you use the unlimited store, which is the other essential component required to build very large servers. I based this article on my experience with a Digital AlphaServer 4000 cluster, composed of two 466MHz AlphaServer 1000 processors with 512MB of memory and a StorageWorks 450 RAID array.
You can build clusters only from specific hardware, so review your existing configurations to determine whether you can configure your hardware in clusters. (For a list of cluster-compatible hardware, see http://www.microsoft.com/ntserver/info/hwcompatibility.htm.) Microsoft does not support upgrades for existing standalone servers to form clusters. (Microsoft might relax its insistence that cluster hardware come from a controlled compatibility list as it gains more experience with clusters.)
Strictly speaking, the hardware must be symmetrical (i.e., the servers in the cluster must be identical in CPU power and memory), a requirement Exchange imposes to enable automatic tuning. The hardware must be symmetrical because the Performance Wizard (PerfWiz) runs only on the primary node. PerfWiz can't run on the secondary node because it can't access the shared disks where the Exchange data resides.
PerfWiz attempts to determine optimum performance settings for Exchange, including initial memory buffer allocations and the best (and the fastest) location for important Exchange files, such as the Information Store. PerfWiz writes this information into the Registry. If the hardware is asymmetrical, the performance settings for the primary node might not match the characteristics of the secondary node after a failover occurs, and performance will inevitably suffer. However, if the two servers have similar memory and CPU power, you can probably accept settings that are slightly less than optimal after a failover.
Exchange 5.5/E introduces dynamic buffer allocation. This feature constantly monitors and adjusts Exchange's memory use in response to other programs' demands for NT's memory.
Dynamic buffer allocation negates some, but possibly not all, of the effects of failing over to asymmetric hardware. For best results, follow Microsoft's recommendations and use identical hardware for both nodes.
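Before committing to a pair of servers, you could record each node's specification and compare the two. The Python sketch below shows one way to do that; the `NodeSpec` type and the `asymmetries` function are hypothetical helpers, and they compare only the two attributes the article names (CPU power and memory).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    name: str
    cpu_mhz: int
    memory_mb: int


def asymmetries(primary, secondary):
    """List the hardware attributes on which the two nodes differ."""
    differences = []
    for attr in ("cpu_mhz", "memory_mb"):
        a, b = getattr(primary, attr), getattr(secondary, attr)
        if a != b:
            differences.append(
                f"{attr}: {primary.name}={a}, {secondary.name}={b}"
            )
    return differences
```

An empty list means the settings that PerfWiz derives on the primary node should suit the secondary node equally well. Any entry in the list marks a setting that might be mistuned after a failover.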
Installing Exchange into a Cluster
You must use the MSCS administration program to create a cluster group for Exchange before you begin the installation. Installing Exchange on the primary and secondary servers in a cluster requires different processes.
When you install Exchange on a server where MSCS is present, the Setup program creates the Exchange Server directory structure (usually \EXCHSRVR) on a shared cluster drive. You can't select a drive destination that isn't a shared cluster drive. Setup copies all the Exchange executables and data files to the selected drive. The Exchange executables used on a cluster are different from those used on a standalone system.
After Setup creates the directory structure, it creates and registers the Exchange services, copies system shared files into the local %ROOT\SYSTEM32 directory, and creates resource dependencies within MSCS. For example, the Exchange MTA depends on the Information Store. If the store isn't running, the MTA can't start.
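Resource dependencies amount to a directed graph, and a valid start order is a topological sort of that graph. The Python sketch below uses the standard library's `graphlib` module (Python 3.9 or later). Only the "MTA depends on the Information Store" edge comes from the text above; the other edges are hypothetical and are included just to make the example less trivial.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Maps each service to the set of services it depends on.
DEPENDENCIES = {
    "System Attendant": set(),
    "Directory": {"System Attendant"},   # hypothetical edge
    "Information Store": {"Directory"},  # hypothetical edge
    "MTA": {"Information Store"},        # the dependency the article names
}


def start_order(dependencies):
    """Return the services in an order that starts dependencies first."""
    return list(TopologicalSorter(dependencies).static_order())


def can_start(service, running, dependencies):
    """A service can start only when all of its dependencies are running."""
    return dependencies.get(service, set()) <= set(running)
```

Here `can_start("MTA", [], DEPENDENCIES)` is false because the store isn't running, which mirrors the behavior described above.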
When Setup has completed these steps, you can run PerfWiz. Note that PerfWiz analyzes only disks that are defined in the Exchange resource group; it ignores disks local to a server. In a cluster, you can't locate files such as the transaction logs on any disk that isn't available to the cluster as a whole. If you place files such as the transaction logs or Exchange MTA work files on local disks, cluster failovers won't work because the data needed to complete the transition won't be available. Although clustering provides some resilience, the shared disks still represent a potential single point of failure. Good backup discipline remains critical in a clustered environment.
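A simple audit can catch files that have been placed on a local disk by mistake. The following Python sketch is illustrative only: the function name, the component labels, and the drive letters and directory names in the usage note are assumptions, not values that Setup or PerfWiz produce.

```python
import ntpath  # parses Windows-style paths on any platform


def files_on_local_disks(locations, shared_drives):
    """Return the components whose paths are not on a shared cluster drive.

    locations     -- mapping of component name to its Windows path
    shared_drives -- drive specifiers in the Exchange resource group, e.g. {"S:"}
    """
    shared = {drive.upper() for drive in shared_drives}
    offenders = {}
    for component, path in locations.items():
        drive, _ = ntpath.splitdrive(path)
        if drive.upper() not in shared:
            offenders[component] = path
    return offenders
```

For example, if the resource group holds only drive `S:` and the transaction logs sit at a path such as `C:\EXCHSRVR\MDBDATA`, the function reports the logs as an offender. That is the placement that would prevent a failover from completing.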
A cluster can begin operating after you install Exchange on the primary node; you don't have to perform secondary installations immediately. Installation on a secondary server is simpler because most of the files that Exchange uses are already located on the shared cluster drive.
In a secondary installation, Setup copies system shared files into the local %ROOT\SYSTEM32 directory. Then Setup creates resource dependencies and creates and registers Exchange services. Exchange uses wizards to configure the IMS and the Internet News Server (INS). Microsoft has altered these wizards to deal with clusters; run them only on the primary node. However, Microsoft offers an update node option to update the Registry on the secondary node.