AppSuite:Running a cluster

Revision as of 14:27, 18 January 2013 by Tierlieb (Cluster Discovery)

Concepts

For inter-OX-communication over the network, multiple Open-Xchange servers can form a cluster. This brings advantages regarding distribution and caching of volatile data, load balancing, scalability, fail-safety and robustness. Additionally, it provides the infrastructure for upcoming features of the Open-Xchange server. The clustering capabilities of the Open-Xchange server are mainly built on Hazelcast, an open source clustering and highly scalable data distribution platform for Java. The following article provides an overview of the current feature set and configuration options.

Cluster Discovery

To form a cluster of multiple OX server nodes, different discovery mechanisms can be used. Currently, two are available: a static cluster discovery using a fixed set of IP addresses, and a dynamic cluster discovery based on Zeroconf (mDNS). The installation packages conflict with each other, so only one of them can be installed at a time. The same cluster discovery mechanism must also be used on all nodes in the cluster.

Static Cluster Discovery

The package 'open-xchange-cluster-discovery-static' installs the OSGi bundle implementing the OSGi ClusterDiscoveryService. The implementation uses a configuration file that specifies all nodes of the cluster. This cluster discovery module is mutually exclusive with any other cluster discovery module, i.e. only one cluster discovery module can be installed on the backend. When a node is configured to use 'static' cluster discovery, it will try to connect to a pre-defined set of nodes. A comma-separated list of IP addresses of possible nodes is defined in the configuration file 'static-cluster-discovery.properties', e.g.:

   com.openexchange.cluster.discovery.static.nodes=10.20.30.12, 10.20.30.13, 192.178.168.110

For single node installations, the configuration parameter can be left empty. If possible, 'static' cluster discovery should be preferred over the other possibilities, as it allows a new node starting up to directly join an existing cluster. However, probing for the other nodes could lead to a short delay when starting the server.
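The format of the node list can be checked before a node is started. The following is a minimal sketch in Python, not the parser used by the Open-Xchange server: the function name 'parse_static_nodes' is hypothetical, and it assumes that whitespace around the commas is ignored, as the example value above suggests.

```python
import ipaddress


def parse_static_nodes(value):
    """Split a comma-separated node list and validate each entry as an IP address.

    Hypothetical helper for illustration only; an empty value yields an empty
    list, which corresponds to a single node installation.
    """
    nodes = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty entries, e.g. when the parameter is left empty
        # ip_address() raises ValueError if the entry is not a valid IP address
        nodes.append(str(ipaddress.ip_address(part)))
    return nodes


nodes = parse_static_nodes("10.20.30.12, 10.20.30.13, 192.178.168.110")
print(nodes)
```

A malformed entry such as '10.20.30' makes the function raise a ValueError, so typing mistakes show up before the node probes for its peers.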

mDNS Cluster Discovery

The package 'open-xchange-cluster-discovery-mdns' installs the OSGi bundle implementing the OSGi ClusterDiscoveryService. The implementation uses the Zeroconf implementation provided by open-xchange-mdns to find all nodes within the cluster. This cluster discovery module is mutually exclusive with any other cluster discovery module, i.e. only one cluster discovery module can be installed on the backend. mDNS can be enabled or disabled via the 'mdns.properties' configuration file:

  com.openexchange.mdns.enabled=true

When enabled, the nodes publish and discover their services using zero-configuration networking in the mDNS multicast group. The services are prefixed with the cluster's name as configured in 'cluster.properties', meaning that all nodes that should form the cluster must have the same cluster name. When using mDNS cluster discovery, nodes normally start up on their own, as no other nodes in the cluster are known during startup. At that point, each node logically forms a cluster of its own. At a later stage, when other nodes have been discovered, the nodes automatically merge into a bigger cluster, until finally the whole cluster is formed.

Features

The following list gives an overview of the different features that were implemented using the new cluster capabilities.

Distributed Session Storage

Previously, when an Open-Xchange server was shut down for maintenance, all user sessions that were bound to that machine were lost, i.e. the users needed to log in again. With the distributed session storage, all sessions are backed by a distributed map in the cluster, so that they are no longer bound to a specific node in the cluster. When a node is shut down, the session data is still available in the cluster and can be accessed from the remaining nodes. The load-balancing techniques of the webserver then seamlessly route the user session to another node, with no 'session expired' errors. Depending on the cluster infrastructure, different backup-count configuration options might be set for the distributed session storage in the map configuration file 'sessions.properties' in the 'hazelcast' subdirectory:

  com.openexchange.hazelcast.configuration.map.backupCount=1

The 'backupCount' property configures the number of nodes holding synchronous backups. Synchronous backups block operations until the backups have been copied successfully and acknowledgements have been received. For example, if the backup count is set to 1, all entries of the map are copied to another JVM for fail-safety; 0 means no backup. Valid values are integers between 0 and 6. The default is 1; setting a value larger than 6 has no effect.

  com.openexchange.hazelcast.configuration.map.asyncBackupCount=0

The 'asyncBackupCount' property configures the number of nodes holding asynchronous backups. Asynchronous backups do not block operations and do not require acknowledgements. 0 means no backup. Valid values are integers between 0 and 6. The default is 0; setting a value larger than 6 has no effect.
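To get a feeling for what a given combination of the two settings means, the number of copies kept per session entry can be estimated. The following Python sketch is an illustration under stated assumptions, not part of the server: the function name 'copies_per_entry' is hypothetical, it assumes that synchronous and asynchronous backups are placed on distinct nodes and that the cluster has enough members to hold them, and it rejects values outside 0 to 6 as a likely misconfiguration instead of silently ignoring them.

```python
MAX_BACKUP_COUNT = 6  # upper bound given for both properties


def copies_per_entry(backup_count=1, async_backup_count=0):
    """Estimate how many copies of each map entry exist in the cluster.

    Hypothetical helper: one primary copy plus the configured synchronous
    and asynchronous backups. The defaults mirror the documented defaults.
    """
    for value in (backup_count, async_backup_count):
        if not 0 <= value <= MAX_BACKUP_COUNT:
            raise ValueError("backup counts must be between 0 and 6")
    return 1 + backup_count + async_backup_count


print(copies_per_entry())      # documented defaults: backupCount=1, asyncBackupCount=0
print(copies_per_entry(1, 1))  # one synchronous and one asynchronous backup
```

With the default settings, each entry exists twice, so the session data survives the shutdown of a single node.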

Distributed Indexing Jobs

Groupware data is indexed in the background to yield faster search results.

Administration / Troubleshooting

Hazelcast Configuration

The underlying Hazelcast library can be configured using the file 'hazelcast.properties'. On servers with multiple network interfaces, it might be useful to define a fixed interface that should be used with the parameter 'com.openexchange.hazelcast.interfaces'. Otherwise, Hazelcast listens on all interfaces. The Hazelcast JMX MBean can be enabled or disabled with the property 'com.openexchange.hazelcast.jmx'. The properties 'com.openexchange.hazelcast.mergeFirstRunDelay' and 'com.openexchange.hazelcast.mergeRunDelay' control the run intervals of the so-called 'Split Brain Handler' of Hazelcast that initiates the cluster join process when a new node is started. More details can be found at http://www.hazelcast.com/docs/2.5/manual/single_html/#NetworkPartitioning.

Commandline Tool

To print out statistics about the cluster and the distributed data, the 'showruntimestats' commandline tool can be executed with the 'clusterstats' ('c') argument. This provides an overview of the runtime cluster configuration of the node, the other members in the cluster and the distributed data structures.

JMX

In the Open-Xchange server Java process, the MBeans 'com.hazelcast' and 'com.openexchange.hazelcast' can be used to monitor and manage different aspects of the underlying Hazelcast cluster. For test purposes only, the 'com.openexchange.hazelcast' MBean can be used to manually change the configured cluster members, i.e. the list of possible OX nodes in the cluster. The 'com.hazelcast' MBean provides detailed information about the cluster configuration and distributed data structures.

Hazelcast Errors

When experiencing Hazelcast-related errors in the logfiles, the most likely cause is that different versions of the packages are installed, leading to different message formats that can't be understood by nodes using another version. Examples of such errors are exceptions in Hazelcast components regarding (de)serialization or other message processing. This may happen when updating the nodes in the cluster one after another, where nodes with a temporarily heterogeneous setup try to communicate with each other. If the errors don't disappear after all nodes in the cluster have been updated to the same package versions, it might be necessary to shut down the cluster completely, so that all distributed data is cleared.
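Whether a cluster is currently running with mixed package versions can be checked by comparing the versions collected from each node. The following Python sketch only shows the comparison: how the versions are gathered is left open, and the function names, node addresses and version strings are hypothetical.

```python
from collections import defaultdict


def group_nodes_by_version(installed):
    """Group node addresses by the package version installed on them.

    'installed' maps a node address to a version string; both are
    placeholders for whatever an inventory of the cluster reports.
    """
    groups = defaultdict(list)
    for node, version in sorted(installed.items()):
        groups[version].append(node)
    return dict(groups)


def is_homogeneous(installed):
    """Return True if all nodes report the same package version."""
    return len(set(installed.values())) <= 1


# Hypothetical inventory taken in the middle of an update
installed = {
    "10.20.30.12": "7.0.0-5",
    "10.20.30.13": "7.0.0-5",
    "192.178.168.110": "7.0.0-4",
}
print(is_homogeneous(installed))
print(group_nodes_by_version(installed))
```

If more than one group is reported, the serialization errors described above are expected until the remaining nodes have been updated.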

Cluster Discovery

  • If the started OX nodes don't form a cluster, please double-check your configuration in the files 'cluster.properties', 'hazelcast.properties' and 'static-cluster-discovery.properties' / 'mdns.properties'
  • It's important to have the same cluster name defined in 'cluster.properties' throughout all nodes in the cluster
  • Especially when using 'mDNS' cluster discovery, it might take some time until the cluster is formed
  • When using 'static' cluster discovery, at least one other node in the cluster has to be configured in 'com.openexchange.cluster.discovery.static.nodes' to allow joining. However, it's recommended to list all nodes in the cluster here
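
Two of the checks in the list above can be automated once the relevant settings of every node have been collected. The following Python sketch assumes the settings have already been read into a dictionary; the function name 'check_cluster_config' and the keys 'cluster_name' and 'static_nodes' are hypothetical and do not correspond to actual property names.

```python
def check_cluster_config(nodes):
    """Return a list of problems found in the collected node settings.

    'nodes' maps a node address to a dictionary with the hypothetical keys
    'cluster_name' (string) and 'static_nodes' (list of addresses).
    """
    problems = []
    # All nodes must use the same cluster name
    names = {cfg["cluster_name"] for cfg in nodes.values()}
    if len(names) > 1:
        problems.append("cluster names differ: " + ", ".join(sorted(names)))
    # With static discovery, every node must list at least one other node
    for address, cfg in sorted(nodes.items()):
        others = [n for n in cfg["static_nodes"] if n != address]
        if len(nodes) > 1 and not others:
            problems.append(address + " lists no other node")
    return problems


# Hypothetical two-node setup with two mistakes
nodes = {
    "10.20.30.12": {"cluster_name": "oxcluster",
                    "static_nodes": ["10.20.30.12", "10.20.30.13"]},
    "10.20.30.13": {"cluster_name": "oxcluster-test",
                    "static_nodes": []},
}
for problem in check_cluster_config(nodes):
    print(problem)
```

An empty result does not prove that the cluster will form, as network and interface settings are not covered, but it rules out the two most common configuration mistakes named above.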