== Overview ==

You can choose between Galera or Master/Slave replication. We recommend Galera for higher redundancy, easier operations, and synchronous semantics (so you can run OX without our "replication monitor"). For POC or demo setups, a single standalone database may be sufficient.

== Standalone database setup ==

=== Preparations ===

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database server, please be prepared to wipe the datadir, i.e. take a <code>mysqldump</code> for later restoration into the properly configured server.

 mysqldump --databases configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.
  
 
=== Installation ===

Note: the following is not an exhaustive list or an authoritative statement about supported MySQL flavors / versions. Please consult the official support / system requirements statement.

Please follow the upstream docs for your preferred flavor to get the software installed on your system.

* MariaDB (10.1, 10.2): https://downloads.mariadb.org/
* Oracle MySQL Community Server (5.6, 5.7): https://dev.mysql.com/downloads/mysql/

Make sure to double-check that the service is not running (or stop it) after installation, as we need to perform some reconfiguration.

 service mysql stop

=== Configuration ===

MySQL configuration advice is given in our [[My.cnf|MySQL configuration article]]. Please consult that page for configuration information and create configuration files as described there.

Some settings we recommend changing require that the database gets re-initialized. We assume you don't have data there (since we are covering a fresh install) or that you have taken a backup for later restore as explained above in the Preparations section.

 cd /var/lib/
 mv mysql mysql.old.datadir
 mkdir mysql
 chown mysql.mysql mysql

 # mariadb
 mysql_install_db
 # mariadb 10.2
 mysql_install_db --user=mysql
 # oracle 5.6
 mysql_install_db -u mysql
 # oracle 5.7
 mysqld --initialize-insecure --user=mysql

(Don't be worried about the <code>insecure</code>; it just means we set the DB root password in the next steps.)

Start the service. The actual command depends on your OS and on the MySQL flavor.

 service mysql start

Run <code>mysql_secure_installation</code> (called <code>mariadb-secure-installation</code> on recent MariaDB versions) for a "secure by default" installation:

 mariadb-secure-installation

That tool will ask for the current root password (which is empty by default) and subsequently questions like:

 Change the root password? [Y/n]
 Remove anonymous users? [Y/n]
 Disallow root login remotely? [Y/n]
 Remove test database and access to it? [Y/n]
 Reload privilege tables now? [Y/n]

You should answer all these questions with "yes".

Configure a strong password for the MySQL <code>root</code> user.

The further steps in this guide omit <code>-u -p</code> arguments to the MySQL client. Rather than passing them on the command line [https://dev.mysql.com/doc/refman/5.7/en/password-security-user.html], it is recommended to place the credentials in a file like <code>/root/.my.cnf</code>:

 [client]
 user=root
 password=wip9Phae3Beijeed

Make sure the service is enabled by the OS's init system. The actual command depends on your OS and on the MySQL flavor.

 systemctl enable mysql.service

You should now be able to restore your previously taken backup.

 # If you took a dump for restore before
 mysql < backup.sql
 +
 
 +
=== Configure OX to use with a standalone database ===

Not much special wisdom here. OX was designed to be used with master/slave databases, and a standalone server works just as well if we register it as a master and do not register a slave.

For the ConfigDB, <code>configdb.properties</code> allows configuration of a <code>writeUrl</code> (which is set to the correct value if you use <code>oxinstaller</code> with the correct argument <code>--configdb-writehost</code>). The single database is then used for reading and writing.

For the individual UserDBs, use <code>registerdatabase -m true</code>, as sketched below.
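
For illustration only: hostnames, credentials, and the exact option spellings below are placeholders and may differ between versions (check <code>registerdatabase --help</code> on your installation), but registering a standalone database typically boils down to something like:

 # configdb.properties (excerpt), normally written by oxinstaller --configdb-writehost
 #   writeUrl=jdbc:mysql://db.example.local/configdb

 # register a UserDB pool living on the same standalone server, as master without slave
 /opt/open-xchange/sbin/registerdatabase -A oxadminmaster -P admin_master_password \
     --name oxdatabase --hostname db.example.local \
     --dbuser openexchange --dbpasswd db_password -m true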
  
== Galera database setup ==

=== Preparations ===

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database to a Galera cluster, please be prepared to wipe the datadir, i.e. take a <code>mysqldump</code> for later restoration into the properly configured master.

Depending on the flavor of the current database, this can be something like

 # mariadb or oracle mysql without GTIDs
 mysqldump --databases configdb oxdb_{5..14} > backup.sql

 # mysql 5.6 with GTIDs... we don't want GTIDs here
 mysqldump --databases --set-gtid-purged=OFF configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.
  
=== Installation ===

Please follow the upstream docs for your preferred flavor to get the software installed on your system.

* Percona XtraDB Cluster (5.6, 5.7): https://www.percona.com/doc/percona-xtradb-cluster/LATEST/install/index.html
* MariaDB Galera Cluster (10.0, 10.1): https://mariadb.com/kb/en/library/getting-started-with-mariadb-galera-cluster/ (Note: with 10.0, socat is required but not a package dependency, so you need to explicitly install socat as well)

Make sure to double-check that the service is not running (or stop it) after installation, as we need to perform some reconfiguration.

 service mysql stop
  
 
=== Configuration ===

Galera-specific MySQL configuration advice is included in our main [[My.cnf|MySQL configuration article]]. Please consult that page for configuration information.

That page suggests a setup where we add three custom config files to <code>/etc/mysql/ox.conf.d/</code>: <code>ox.cnf</code> for general tuning/sizing, <code>wsrep.cnf</code> for cluster-wide Galera configuration, and <code>host.cnf</code> for host-specific settings.

Adjust the general settings and tunings in <code>ox.cnf</code> according to your sizing etc.

Adjust <code>wsrep.cnf</code> to reflect local paths, cluster member addresses, etc.

Adjust <code>host.cnf</code> to give node-local IPs, etc.
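
For orientation only (the authoritative template is in the [[My.cnf]] article; provider path, SST method, cluster name, node IPs and SST credentials below are placeholders that depend on your flavor, version and environment), a <code>wsrep.cnf</code> typically carries settings along these lines:

 [mysqld]
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2

 # path differs per distro/flavor, e.g. /usr/lib64/galera/libgalera_smm.so
 wsrep_provider=/usr/lib/galera/libgalera_smm.so
 wsrep_cluster_name="ox_galera_cluster"
 wsrep_cluster_address="gcomm://192.168.1.21,192.168.1.22,192.168.1.23"
 # SST method depends on flavor: e.g. xtrabackup-v2 (Percona), mariabackup or rsync (MariaDB)
 wsrep_sst_method=xtrabackup-v2
 wsrep_sst_auth=sstuser:OpIdjijwef0
 # synchronous reads; allows running OX with its replication monitor disabled
 wsrep_causal_reads=1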
  
Version-specific hints:

 # percona 5.6: unknown variable 'pxc_strict_mode=ENFORCING' ... unset that one
 # mariadb 10.1: add wsrep_on=ON
 # mariadb 10.0 and 10.1: set wsrep_node_incoming_address=192.168.1.22:3306 in host.cnf, otherwise the status wsrep_incoming_addresses might not be shown correctly

Some settings we recommend changing require that the database gets re-initialized. We assume you don't have data there (since we are covering a fresh install) or that you have taken a backup for later restore as explained above in the Preparations section.

 cd /var/lib/
 mv mysql mysql.old.datadir
 mkdir mysql
 chown mysql.mysql mysql

 # mariadb 10.0 and 10.1
 mysql_install_db
 # mariadb 10.2
 mysql_install_db --user=mysql
 # percona 5.6
 mysqld --user=mysql
 # percona 5.7
 mysqld --initialize-insecure --user=mysql

(Don't be worried about the <code>insecure</code>; it just means we set the DB root password in the next steps.)
  
=== Cluster startup ===

Typically on startup a Galera node tries to join a cluster, and if that fails, it exits. Thus, when no cluster nodes are running yet, the first cluster node to be started needs to be told not to try to join a cluster, but rather to bootstrap a new one. The exact arguments vary from version to version and from flavor to flavor.

==== First node ====

So we initialize the cluster bootstrap on the first node:

 # percona 5.6, 5.7
 service mysql bootstrap-pxc
 # mariadb 10.0
 service mysql bootstrap
 # mariadb 10.1, 10.2
 galera_new_cluster

Run <code>mysql_secure_installation</code> for a "secure by default" installation:

 mysql_secure_installation

The further steps in this guide omit <code>-u -p</code> arguments to the MySQL client. Rather than passing them on the command line [https://dev.mysql.com/doc/refman/5.7/en/password-security-user.html], it is recommended to place the credentials in a file like <code>/root/.my.cnf</code>:

 [client]
 user=root
 password=wip9Phae3Beijeed
  
We need a Galera replication user:

 CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'OpIdjijwef0';
 -- percona 5.6, mariadb 10.0
 GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
 -- percona 5.7, mariadb 10.1, 10.2
 GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
 FLUSH PRIVILEGES;

(Debian-specific note: MariaDB-provided startup scripts use the distro's mechanism of verifying startup/shutdown using a system user, so we create that as well:

 # mariadb 10.0, 10.1, 10.2
 GRANT ALL PRIVILEGES ON *.* TO "debian-sys-maint"@"localhost" IDENTIFIED BY "adBexthTsI5TaEps";

If you do this, you need to synchronize the <code>/etc/mysql/debian.cnf</code> file from the first node to the other nodes as well.)
  
==== Other nodes ====

On the other nodes, we only need to restart the service now, in order to trigger a full state transfer from the first node to the other nodes.

We recommend doing this serially, letting one state transfer complete before starting the next one, as sketched below.
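
A minimal illustration (the exact service command depends on your OS and flavor), waiting for the joiner to report "Synced" before moving on to the next node:

 # on node 2; repeat on node 3 only after node 2 reports "Synced"
 service mysql restart
 mysql -e "show status like 'wsrep_local_state_comment';"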

==== First node (continued) ====

Only applicable if you used <code>galera_new_cluster</code> before rather than the service script: in order to get the systemctl status consistent, restart the service on the first node:

 # mariadb 10.1, 10.2: restart the service so that the systemctl status is consistent
 mysqladmin shutdown
 service mysql bootstrap
  
=== Verify the replication ===

The key tool to verify replication status is

 mysql> show status like "%wsrep%";

This will give a lot of output. You want to verify in particular

 +------------------------------+--------------------------------------+
 | Variable_name                | Value                                |
 +------------------------------+--------------------------------------+
 | wsrep_cluster_size           | 3                                    |
 | wsrep_cluster_status         | Primary                              |
 | wsrep_local_state            | 4                                    |
 | wsrep_local_state_comment    | Synced                               |
 | wsrep_ready                  | ON                                   |
 +------------------------------+--------------------------------------+

You can also explicitly verify replication by creating / inserting databases, tables, and rows on one node and selecting them on the other nodes, as sketched below.
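
For instance (using a throwaway database name, purely illustrative):

 -- on node 1
 CREATE DATABASE repltest;
 CREATE TABLE repltest.t (id INT PRIMARY KEY);
 INSERT INTO repltest.t VALUES (1);
 -- on node 2 and node 3: the row should be visible immediately
 SELECT * FROM repltest.t;
 -- cleanup (on any node)
 DROP DATABASE repltest;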

==== Troubleshooting ====

The logs are helpful. Always.

Common mistakes are listed below.

If the Galera module does not get loaded at all:

* Configuration settings in my.cnf which are incompatible with Galera
* Wrong path of the shared object providing the Galera plugin in wsrep.cnf (wsrep_provider)

If the first node starts, but the second / third nodes cannot be added to the cluster:

* User for the replication not created correctly on the first Galera node
* SST fails due to missing / wrong-version prerequisite packages (not everything is hardcoded in package dependencies -- make sure you have percona-xtrabackup installed in the correct version, and also socat). If SST fails, do not only look into mysql's primary error logs, but also into the log files from the SST tool in /var/lib/mysql on the donor node.

=== Notes about configuring OX for use with Galera ===
 
==== Write requests ====

Open-Xchange supports Galera as database backend only in the configuration where all writes are directed to one Galera node. For availability, it makes sense not to configure one Galera node's IP address directly, but rather to employ some HA solution which offers active-passive functionality. Options for this are discussed below.

==== Read requests ====

Read requests can be directed to any node in the Galera cluster. Our standard recommendation is to use a loadbalancer to implement round-robin over all nodes in the Galera cluster for the read requests. But you can also choose to use a dedicated read node (the same node as, or a different node than, the write node). Each of the approaches has its own advantages.

* Load balancer based setup: Read requests get distributed round-robin between the Galera nodes. Theoretically, by distributing the load of the read requests, you benefit from lower latencies and more throughput, but this has never been benchmarked yet. For a discussion of available loadbalancers, see the next section. OX-wise, in this configuration, you have two alternatives (see also the configuration sketch after this list):
** The Galera option wsrep_causal_reads=1 enables you to configure OX with its replication monitor disabled (com.openexchange.database.replicationMonitor=false in configdb.properties). This is the setup which seems to perform best according to our experience, as turning off the replication monitor significantly reduces the commits on the DB and thus the write operations per second on the underlying storage, which outweighs the drawback of higher commit latency due to the fully synchronous mode.
** Alternatively, you can run Galera with wsrep_causal_reads=0 and switch on the OX builtin replication monitor. This is also a valid setup.
* Use a designated floating IP for the read requests: This eliminates the need for a load balancer. With this option you will not gain any performance, but the quantitative benefit is unclear anyhow.
* Use the floating IP for the writes also for the reads: In this scenario, you direct all database queries to only one Galera node, and the other two nodes only get queries in case of a failure of that node. In this case, you can even use wsrep_causal_reads=0 while still having the OX builtin replication monitor switched off. However, we do not expect this option to be superior to the round-robin loadbalancer approach.
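
To make the first alternative concrete, a minimal sketch of the two settings involved (file locations follow the layout used elsewhere in this guide):

 # wsrep.cnf (on the Galera nodes): fully synchronous reads
 wsrep_causal_reads=1

 # configdb.properties (on the OX nodes): builtin replication monitor can then be disabled
 com.openexchange.database.replicationMonitor=false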
  
 
=== Loadbalancer options ===

While the JDBC driver has some round-robin load balancing capabilities built in, we don't recommend it for production use since it lacks possibilities to check the Galera nodes' health states.

Loadbalancers used for OX -> Galera loadbalancing should be able to implement active-passive instances for the write requests, and active-active (round-robin) instances for the read requests. (If they cannot implement active-passive, you can still use a floating IP for that purpose.) Furthermore, it is required to configure node health checks not only on the TCP level (by a simple connect), but to query the Galera health status periodically, evaluating the Galera WSREP status variables. Otherwise split-brain scenarios or other bad states cannot be detected. For an example of such a health check, see our [[Clustercheck]] page.

Some customers use loadbalancing appliances. If the (virtual) infrastructure offers "loadbalancer" instances, it is important to check that they satisfy the given requirements; often this is not the case. In particular, a simple "DNS round robin" approach is not viable.
  
==== LVS/ipvsadm/keepalived ====

If you want to create your own loadbalancers based on Linux, we usually recommend LVS (Linux Virtual Servers) controlled by Keepalived. LVS is a set of kernel modules implementing an L4 loadbalancer which performs quite well. Keepalived is a userspace daemon to control LVS rules, using health checks to reconfigure LVS rules if required. Keepalived / LVS requires one (or, for availability, two) dedicated Linux nodes to run on. This can be a disadvantage for some installations, but usually it pays off. We provide some configuration information on Keepalived [[Keepalived|here]].

==== MariaDB Maxscale ====

Since Maxscale became GA in 2015, it seems to have undergone significant stability, performance and functional improvements. We are currently experimenting with Maxscale and share our installation / configuration knowledge [[Maxscale|here]]. It looks quite promising and might become ''the standard replacement'' for HAproxy, while we still presume Keepalived offers superior robustness and performance, coming at the cost of requiring one (or more) dedicated loadbalancer nodes.

==== HAproxy ====

In cases where the Keepalived based approach is not feasible due to its requirements on the infrastructure, it is also possible to use an HAproxy based solution where HAproxy processes run on each of the OX nodes, configured for one round-robin and one active/passive instance. OX then connects to the local HAproxy instances. It is vital to configure HAproxy timeouts different from the defaults, otherwise HAproxy will kill active DB connections, causing errors. Be aware that in large installations the number of (distributed) HAproxy instances can get quite large. Some configuration hints for HAproxy are available [[HAproxy|here]].
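
As a rough, illustrative sketch of that layout (listener ports, node IPs and timeout values are placeholders, and a production setup should additionally use a Galera-aware health check as discussed above rather than plain TCP checks):

 # /etc/haproxy/haproxy.cfg excerpt on each OX node (illustrative only)
 defaults
     mode tcp
     timeout connect 10s
     timeout client  1h
     timeout server  1h

 # active/passive instance for the write requests
 listen galera_write
     bind 127.0.0.1:3306
     server galera1 192.168.1.21:3306 check
     server galera2 192.168.1.22:3306 check backup
     server galera3 192.168.1.23:3306 check backup

 # round-robin instance for the read requests
 listen galera_read
     bind 127.0.0.1:3307
     balance roundrobin
     server galera1 192.168.1.21:3306 check
     server galera2 192.168.1.22:3306 check
     server galera3 192.168.1.23:3306 check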
  
== Master/Slave database setup ==

While we support "legacy" (pre-GTID) Master/Slave replication as well, we recommend using GTID based replication, for easier setup and failure recovery. Support for GTID based replication has been added with OX 7.8.0.

GTID has been available since MySQL 5.6, so no 5.5 installation instructions below, sorry. We try to be generic in this documentation (thus, applicable to Oracle Community Edition and MariaDB) and point out differences where needed. Note: Instructions below include information about Oracle Community MySQL 5.7, which is not yet formally supported.

=== Preparations ===

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database to GTID master-slave, please be prepared to wipe the datadir, i.e. take a <code>mysqldump</code> for later restoration into the properly configured master.

Depending on the flavor of the current database, this can be something like

 # mariadb or oracle mysql without GTIDs
 mysqldump --databases configdb oxdb_{5..14} > backup.sql

 # mysql 5.6 with GTIDs... we don't want GTIDs here
 mysqldump --databases --set-gtid-purged=OFF configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.

=== Installation ===

Software installation is identical for master and slave.

Please follow the vendors' installation instructions.

* Oracle Community Edition: https://dev.mysql.com/doc/mysql-apt-repo-quick-guide/en/
* MariaDB (10.0, 10.1): https://downloads.mariadb.org/mariadb/repositories/

Stop the service (if it is running):

 service mysql stop

=== Configuration ===

Configuration as per configuration files is also identical for master and slave.

Consult [[My.cnf]] for general recommendations on how to configure databases for usage with OX.

For GTID based replication, make sure you add some configurables to a new <code>/etc/mysql/ox.conf.d/gtid.cnf</code> file (assuming you are following our proposed schema of adding a <code>!includedir /etc/mysql/ox.conf.d/</code> directive to <code>/etc/mysql/my.cnf</code>):

 # GTID
 log-bin=mysql-bin
 server-id=...
 log_slave_updates = ON

For Oracle Community Edition, we also need to add

 enforce_gtid_consistency = ON
 gtid_mode = ON

(GTID mode is on by default on MariaDB.)

Use a unique <code>server-id</code> for each server; like <code>1</code> for the master, <code>2</code> for the slave. For more complicated setups (like multiple slaves), adjust accordingly.

Since applying our configuration / sizing requires reinitialization of the MySQL datadir, we wipe/recreate it. Caution: this assumes we are running an empty database. If there is data in the database you want to keep, use mysqldump. See the Preparations section above.

So, to initialize the datadir:

 cd /var/lib/
 mv mysql mysql.old.datadir
 mkdir mysql
 chown mysql.mysql mysql
  
(When coming from an existing installation, be sure to also wipe old binlogs. They can confuse the server on startup. Their location varies by configuration.)

The step to initialize the datadir is different for the different DBs:

 # MariaDB 10.0, 10.1
 mysql_install_db

 # MariaDB 10.2
 mysql_install_db --user=mysql

 # Oracle 5.6
 mysql_install_db -u mysql

 # Oracle 5.7
 mysqld --initialize-insecure --user=mysql

(Don't be worried about the <code>insecure</code>; it just means we set the DB root password in the next steps.)

Then:

 service mysql restart
 mysql_secure_installation

We want to emphasize the last step: do run <code>mysql_secure_installation</code>.

Steps up to here apply to both the designated master and the designated slave. The next steps apply to the master.
  
=== Replication Setup ===

==== Master Setup ====

Create a replication user on the master (but, as always, pick your own password, and use the same password in the slave setup below):

 mysql -e "CREATE USER 'repl'@'gtid-slave.localdomain' IDENTIFIED BY 'IvIjyoffod2'; GRANT REPLICATION SLAVE ON *.* TO 'repl'@'gtid-slave.localdomain';"

Now would also be the time to restore a previously created mysqldump, or to add other users you need for administration, monitoring etc. (like <code>debian-sys-maint@localhost</code>, for example). Adding the OX users is explained below ("Creating Open-Xchange user").

 # If you took a dump for restore before
 mysql < backup.sql

To prepare for the initial sync of the slave, set the master read-only:

 mysql -e "SET @@global.read_only = ON;"

Create a dump to initialize the slave:

 # MariaDB
 mysqldump --all-databases --triggers --routines --events --master-data --gtid > master.sql

 # Oracle
 mysqldump --all-databases --triggers --routines --events --set-gtid-purged=ON > master.sql

Transfer it to the slave:

 scp master.sql gtid-slave:
  
==== Slave Setup ====

Configure the replication master settings. Note we don't need complicated binlog position settings etc. with GTID.

Yet again DB-specific (use the repl user password from above):

 # MariaDB
 mysql -e 'CHANGE MASTER TO MASTER_HOST="gtid-master.localdomain", MASTER_USER="repl", MASTER_PASSWORD="IvIjyoffod2";'

 # Oracle
 mysql -e "CHANGE MASTER TO MASTER_HOST='gtid-master.localdomain', MASTER_USER='repl', MASTER_PASSWORD='IvIjyoffod2', MASTER_AUTO_POSITION=1;"
 # https://www.percona.com/blog/2013/02/08/how-to-createrestore-a-slave-using-gtid-replication-in-mysql-5-6/
 mysql -e "RESET MASTER;"

Read the master dump:

 mysql < master.sql

Start replication on the slave:

 mysql -e 'START SLAVE;'
 mysql -e 'SHOW SLAVE STATUS\G'

==== Master Setup (continued) ====

Finally, unset read-only on the master:

 # on the master
 mysql -e "SET @@global.read_only = OFF;"

=== Configure OX to use with Master/Slave replication ===

Not much special wisdom here. OX was designed to be used with master/slave databases. For the ConfigDB, <code>configdb.properties</code> allows configuration of a <code>readUrl</code> and <code>writeUrl</code> (both of which are set to the correct values if you use <code>oxinstaller</code> with the correct arguments <code>--configdb-readhost</code>, <code>--configdb-writehost</code>). Obviously, the master is for writing and the slave is for reading.

For the individual UserDBs, use <code>registerdatabase -m true</code> for the masters and <code>registerdatabase -m false -M ...</code> for the respective slaves, as sketched below.
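
For illustration only (hostnames, the database id, and credentials are placeholders; option spellings may differ between versions, check <code>registerdatabase --help</code>):

 # register the master; note the id returned by the command (assumed to be 4 below)
 /opt/open-xchange/sbin/registerdatabase -A oxadminmaster -P admin_master_password \
     --name oxdatabase --hostname gtid-master.localdomain \
     --dbuser openexchange --dbpasswd db_password -m true

 # register the slave, pointing to the master's id
 /opt/open-xchange/sbin/registerdatabase -A oxadminmaster -P admin_master_password \
     --name oxdatabase_slave --hostname gtid-slave.localdomain \
     --dbuser openexchange --dbpasswd db_password -m false -M 4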

Be sure to have the replication monitor enabled in <code>configdb.properties</code>: <code>com.openexchange.database.replicationMonitor=true</code> (which it is by default); while GTID replication can show synchronous semantics, it is specified to silently fall back to asynchronous under certain circumstances, so synchronicity is not guaranteed.
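
As a sketch (hostnames and the URL format are illustrative; the property names are the ones referenced above):

 # configdb.properties (excerpt)
 readUrl=jdbc:mysql://gtid-slave.localdomain/configdb
 writeUrl=jdbc:mysql://gtid-master.localdomain/configdb
 com.openexchange.database.replicationMonitor=true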

We recommend, though, not to register the databases directly by their native hostname or IP, but rather to use some kind of HA system in order to be able to easily move a floating/failover IP from the master to the slave in case of master failure. Configuring and running such systems (like corosync/pacemaker, keepalived, or the like) is out of scope of this documentation, however.
  
 
== Creating Open-Xchange user ==

Now setup access for the Open-Xchange Server database user 'openexchange' to configdb and the oxdb for both groupware server addresses. These databases do not exist yet, but will be created during the Open-Xchange Server installation.

Notes:

* Please use a real password.
* The IPs in this example belong to the two different Open-Xchange Servers, please adjust them accordingly.
* If using a database on the same host as the middleware (usually done for POCs and demo installations), you need to grant also to the ''localhost'' host.
* Consult [[AppSuite:DB_user_privileges]] (or ''grep GRANT /opt/open-xchange/sbin/initconfigdb'') for an up-to-date list of required privileges. The following statement was correct as of the time of writing this section.

 mysql> GRANT CREATE, LOCK TABLES, REFERENCES, INDEX, DROP, DELETE, ALTER, SELECT, UPDATE, INSERT, CREATE TEMPORARY TABLES, SHOW VIEW, SHOW DATABASES ON *.* TO 'openexchange'@'10.20.30.213' IDENTIFIED BY 'IntyoyntOat1' WITH GRANT OPTION;
 mysql> GRANT CREATE, LOCK TABLES, REFERENCES, INDEX, DROP, DELETE, ALTER, SELECT, UPDATE, INSERT, CREATE TEMPORARY TABLES, SHOW VIEW, SHOW DATABASES ON *.* TO 'openexchange'@'10.20.30.215' IDENTIFIED BY 'IntyoyntOat1' WITH GRANT OPTION;

Latest revision as of 09:03, 16 September 2021

Overview

You can choose between Galera or Master/Slave replication. We like to recommend to use Galera for higher redudancy, easier operations, und synchronous semantics (so you can run OX without our "replication monitor"). For POC or demo setups, a single standalone database setup might be sufficient.

Standalone database setup

Preparations

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database server, please be prepared to wipe the datadir, i.e. take a mysqldump for later restoration into the properly configured master.

mysqldump --databases configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.

Installation

Note: the following list is not an exclusive list or authorative statement about supported MySQL flavors / versions. Please consult the official support / system requirements statement.

Please follow the upstream docs for your preferred flavor to get the software installed on your system.

Make sure to doublecheck the service is not running (or stop it) after installation as we need to perform some reconfigurations.

service mysql stop

Configuration

MySQL configuration advise is given in our MySQL configuration article. Please consult that page for configuration information and create configuration files as described there.

Some settings we recommend to change require that the database gets re-initialized. We assume you don't have data there (since we are covering a fresh install) or you have taken a backup for later restore as explained above in the Preparations section.

cd /var/lib/
mv mysql mysql.old.datadir
mkdir mysql
chown mysql.mysql mysql

# mariadb
mysql_install_db
# mariadb 10.2
mysql_install_db --user=mysql
# oracle 5.6
mysql_install_db -u mysql
# oracle 5.7
mysqld --initialize-insecure --user=mysql

(Don't be worried about the insecure, it just means we set the db root pw in the next steps.)

Start the service. The actual command depends on your OS and on the MySQL flavor.

service mysql start

Run mysql_secure_installation for a "secure by default" installation:

mariadb-secure-installation

That tool will ask for the current root password (which is empty by default) and subsequently questions like:

Change the root password? [Y/n]
Remove anonymous users? [Y/n]
Disallow root login remotely? [Y/n]
Remove test database and access to it? [Y/n]
Reload privilege tables now? [Y/n]

You should answer all these questions with "yes".

Configure a strong password for the MySQL root user.

The further steps in this guide omit -u -p arguments to the MySQL client. Rather than passing them on the command line [1] it is recommended to place the credentials in a file like /root/.my.cnf like

[client]
user=root
password=wip9Phae3Beijeed

Make sure the service is enabled by the OS's init system. The actual command depends on your OS and on the MySQL flavor.

systemctl enable mysql.service

You should now be able to restore your previously taken backup.

# If you took a dump for restore before
mysql < backup.sql

Configure OX to use with a standalone database

Not much special wisdom here. OX was designed to be used with master/slave databases, and a standalone master works just as well, if we register it as a master, and not registering a slave.

For the ConfigDB, configdb.properties allows configuration of a writeUrl (which is set to the correct values if you use oxinstaller with the correct argument --configdb-writehost).

The single database is then used for reading and writing.

For the individiual UserDBs, use registerdatabase -m true.

Galera database setup

Preparations

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database to Galera cluster, please be prepared to wipe the datadir, i.e. take a mysqldump for later restoration into the properly configured master.

Depeding on the flavor of the current database, this can be something like

# mariadb or oracle mysql without GTIDs
mysqldump --databases configdb oxdb_{5..14} > backup.sql

# mysql 5.6 with GTIDs... we dont want GTIDs here
mysqldump --databases --set-gtid-purged=OFF configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.

Installation

Please follow the upstream docs for your preferred flavor to get the software installed on your system.

Make sure to doublecheck the service is not running (or stop it) after installation as we need to perform some reconfigurations.

service mysql stop

Configuration

Galera-specific MySQL configuration advise is included in our main MySQL configuration article. Please consult that page for configuration information.

That page suggests a setup were we add three custom config files to /etc/mysql/ox.conf.d/: ox.cnf for general tuning/sizing, wsrep.cnf for clusterwide galera configuration, and host.cnf for host-specific settings.

Adjust the general settings and tunings in ox.cnf according to your sizing etc.

Adjust wsrep.cnf to reflect local paths, cluster member addresses, etc.

Adjust host.cnf to give node-local IPs, etc.

Version-specific hints:

# percona 5.6: unknown variable 'pxc_strict_mode=ENFORCING' ... unset that one
# mariadb 10.1: add wsrep_on=ON
# mariadb 10.0 and 10.1: set wsrep_node_incoming_address=192.168.1.22:3306 in host.cnf, otherwise the status wsrep_incoming_addresses might not be shown correctly(?!)

Some settings we recommend to change require that the database gets re-initialized. We assume you don't have data there (since we are covering a fresh install) or you have taken a backup for later restore as explained above in the Preparations section.

cd /var/lib/
mv mysql mysql.old.datadir
mkdir mysql
chown mysql.mysql mysql

# mariadb 10.0 and 10.1
mysql_install_db
# mariadb 10.2
mysql_install_db --user=mysql
# percona 5.6
mysqld --user=mysql
# percona 5.7
mysqld --initialize-insecure --user=mysql

(Don't be worried about the insecure, it just means we set the db root pw in the next steps.)

Cluster startup

Typically on startup a Galera node tries to join a cluster, and if it fails, it will exit. Thus, when no cluster nodes are running, the first cluster node to be started needs to be told to not try to join a cluster, but rather bootstrap a new cluster. The exact arguments vary from version to version and from flavor to flavor.

First node

So we initialize the cluster bootstrap on the first node:

# percona 5.6, 5.7
service mysql bootstrap-pxc
# mariadb 10.0
service mysql bootstrap
# mariadb 10.1, 10.2
galera_new_cluster

Run mysql_secure_installation for a "secure by default" installation:

mysql_secure_installation

The further steps in this guide omit -u -p arguments to the MySQL client. Rather than passing them on the command line [2] it is recommended to place the credentials in a file like /root/.my.cnf like

[client]
user=root
password=wip9Phae3Beijeed

We need a Galera replication user:

CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'OpIdjijwef0';
-- percona 5.6, mariadb 10.0
GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
-- percona 5.7, mariadb 10.1, 10.2
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;

(Debian specific note: MariaDB provided startup scripts use the distro's mechanism of verifying startup/shutdown using a system user, so we create that as well:

# mariadb 10.0, 10.1, 10.2
GRANT ALL PRIVILEGES ON *.* TO "debian-sys-maint"@"localhost" IDENTIFIED BY "adBexthTsI5TaEps";

If you do this, yo need to synchronize the /etc/mysql/debian.cnf file from the first node to the other nodes as well.)

Other nodes

On the other nodes, we only need to restart the service now, to trigger a full state transfer from the first node to the other nodes.

We recommend to do this serially to let one state transfer complete before the second state transfer.

First node (continued)

Only applicable if you used galera_new_cluster before rather than the service script: In order to get the systemctl status consistent, restart the service on the first node:

# mariadb 10.1, 10.2: restart the service so that the systemctl status is consistent
mysqladmin shutdown
service mysql bootstrap

Verify the replication

The key tool to verify replication status is

mysql> show status like "%wsrep%";

This will give a lot of output. You want to verify in particular

+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
| wsrep_cluster_size           | 3                                    |
| wsrep_cluster_status         | Primary                              |
| wsrep_local_state            | 4                                    |
| wsrep_local_state_comment    | Synced                               |
| wsrep_ready                  | ON                                   |
+------------------------------+--------------------------------------+

You can also explicitly verify replication by creating / inserting DBs, tables, rows on one node and select on other nodes.

Troubleshooting

The logs are helpful. Always.

Common mistakes are listed below.

If the Galera module does not get loaded at all:

  • Configuration settings in my.cnf which are incompatible to Galera
  • Wrong path of the shared object providing the Galera plugin in wsrep.cnf (wsrep_provider)

If the first node starts, but the second / third nodes can not be added to the cluster:

  • User for the replication not created correctly on the first Galera node
  • SST fails due to missing / wrong version prerequisite packages (not everything is hardcoded in package dependencies -- make sure you got percona-xtrabackup installed in the correct version, and also socat). If SST fails, do not only look into mysqls primary error logs, but also into logfiles from the SST tool in /var/lib/mysql on the donor node.

Notes about configuring OX for use with Galera

Write requests

Open-Xchange supports Galera as database backend only in the configuration where all writes are directed to one Galera node. For availability, it makes sense to not configure one Galera node's IP address directly, but rather employ some HA solution which offers active-passive functionality. Options therefore are discussed below.

Read requests

Read requests can be directed to any node in the Galera cluster. Our standard approach is to recommend to use a loadbalancer to implement round-robin over all nodes in a Galera cluster for the read requests. But you can also chose to use a dedicated read node (the same node, or a different node, than the write node). Each of the approaches has its own advantages.

  • Load balancer based setup: Read requests get distributed round-robin between the Galera nodes. Theoretically by distributing the load of the read requests, you benefit from lower latencies and more throughput. But this has never been benchmarked yet. For a discussion of available loadbalances, see next section. OX-wise, in this configuration, you have two alternatives:
    • The Galera option wsrep_causal_reads=1 option enables you to configure OX with its replication monitor disabled (com.openexchange.database.replicationMonitor=false in configdb.properties). This is the setup which seems to perform best according to our experience as turning off the replication monitor reduces the commits on the DB and thus the write operations per second on the underlying storage significantly, which outweights the drawback from having higher commit latency due to fully synchronous mode.
    • Alternatively, you can run Galera with wsrep_causal_reads=0 when switching on OX builtin replication monitor. This is also a valid setup.
  • Use a designated floating IP for the read requests: This eliminates the need of a load balancer. With this option you will not gain any performance, but the quantitative benefit is unclear anyhow.
  • Use the floating IP for the writes also for the reads: In this scenario, you direct all database queries only to one Galera node, and the other two nodes are only getting queries in case of a failure of that node. In this case, you can even use wsrep_causal_reads=0 while still having OX builtin replication monitor switched off. However we do not expect this option to be superior to the round-robin loadbalancer approach.

Loadbalancer options

While the JDBC driver has some round-robin load balancing capabilities built-in, we don't recommend it for production use since it lacks possibilities to check the Galera nodes health states.

Loadbalancers used for OX -> Galera loadbalancing should be able to implement active-passive instances for the write requests, and active-active (round-robin) instances for the read requests. (If they cannot implement active-passive, you can still use a floating IP for that.) Furthermore, it is required to configure node health checks not only on the TCP level (by a simple connect), but to query the Galera health status periodically by evaluating the Galera WSREP status variables; otherwise split-brain scenarios or other bad states cannot be detected. For an example of such a health check, see our Clustercheck page.
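A health check should evaluate at least the following status variables; a node is healthy when wsrep_local_state is 4 ("Synced") and wsrep_cluster_status is "Primary". This is only a minimal sketch; the Clustercheck script does this in a more complete way:

mysql -e "SHOW STATUS LIKE 'wsrep_local_state';"
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_status';"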

Some customers use loadbalancing appliances. If the (virtual) infrastructure offers "loadbalancer" instances, it is important to check that they satisfy the given requirements; often this is not the case. In particular, a simple "DNS round robin" approach is not viable.

LVS/ipvsadm/keepalived

If you want to create your own loadbalancers based on Linux, we usually recommend LVS (Linux Virtual Server) controlled by Keepalived. LVS is a set of kernel modules implementing an L4 loadbalancer which performs quite well. Keepalived is a userspace daemon that controls the LVS rules and uses health checks to reconfigure them if required. Keepalived / LVS requires one (or, for availability, two) dedicated Linux nodes to run on. This can be a disadvantage for some installations, but usually it pays off. We provide some configuration information on Keepalived here.
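A minimal Keepalived virtual_server sketch for the read instance, assuming example IPs from the 192.0.2.0/24 documentation range and the clustercheck script exposed via xinetd on port 9200; this is not a complete keepalived.conf and needs to be adjusted to your environment:

virtual_server 192.0.2.100 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    protocol TCP

    real_server 192.0.2.11 3306 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_port 9200
            connect_timeout 3
        }
    }
    # ...add real_server blocks for the remaining Galera nodes accordingly
}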

MariaDB Maxscale

Since Maxscale became GA in 2015, it has undergone significant stability, performance and functional improvements. We are currently experimenting with Maxscale and share our installation / configuration knowledge here. It looks quite promising and might become the standard replacement for HAproxy, while we still presume Keepalived offers superior robustness and performance, at the cost of requiring one (or more) dedicated loadbalancer nodes.

HAproxy

In cases where the Keepalived based approach is not feasible due to its infrastructure requirements, it is also possible to use an HAproxy based solution where HAproxy processes run on each of the OX nodes, configured with one round-robin and one active/passive instance. OX then connects to the local HAproxy instances. It is vital to configure HAproxy timeouts that differ from the defaults; otherwise HAproxy will kill active DB connections, causing errors. Be aware that in large installations the number of (distributed) HAproxy instances can get quite large. Some configuration hints for HAproxy are available here.
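A minimal HAproxy sketch illustrating the two instances, with example IPs and the clustercheck assumed on port 9200; the generous timeouts are placeholders meant to illustrate that the defaults must be raised:

# /etc/haproxy/haproxy.cfg (excerpt)
listen galera-read
    bind 127.0.0.1:3307
    mode tcp
    balance roundrobin
    option httpchk
    timeout client 8h
    timeout server 8h
    server galera1 192.0.2.11:3306 check port 9200
    server galera2 192.0.2.12:3306 check port 9200
    server galera3 192.0.2.13:3306 check port 9200

listen galera-write
    bind 127.0.0.1:3306
    mode tcp
    option httpchk
    timeout client 8h
    timeout server 8h
    server galera1 192.0.2.11:3306 check port 9200
    server galera2 192.0.2.12:3306 check port 9200 backup
    server galera3 192.0.2.13:3306 check port 9200 backup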

Master/Slave database setup

While we also support "legacy" (pre-GTID) Master/Slave replication, we recommend using GTID based replication for easier setup and failure recovery. Support for GTID based replication has been added with OX 7.8.0.

GTID has been available since MySQL 5.6, so there are no 5.5 installation instructions below. We try to keep this documentation generic (thus applicable to Oracle Community Edition and MariaDB) and point out differences where needed. Note: the instructions below include information about Oracle Community MySQL 5.7, which is not yet formally supported.

Preparations

Our configuration process includes wiping and reinitializing the datadir. This is usually not a problem in a fresh installation. If you want to upgrade an existing database to GTID master-slave, please be prepared to wipe the datadir, i.e. take a mysqldump for later restoration into the properly configured master.

Depending on the flavor of the current database, this can be something like

# mariadb or oracle mysql without GTIDs
mysqldump --databases configdb oxdb_{5..14} > backup.sql

# mysql 5.6 with GTIDs... we don't want GTIDs here
mysqldump --databases --set-gtid-purged=OFF configdb oxdb_{5..14} > backup.sql

Be sure to verify the list of databases.

Installation

Software installation is identical for master and slave.

Please follow the vendors' installation instructions.

Stop the service (if it is running):

service mysql stop

Configuration

The configuration-file based configuration is also identical for master and slave.

Consult My.cnf for general recommendations on how to configure databases for use with OX.

For GTID based replication, make sure you add some configurables to a new /etc/mysql/ox.conf.d/gtid.cnf file (assuming you are following our proposed scheme of adding an !includedir /etc/mysql/ox.conf.d/ directive to /etc/mysql/my.cnf):

# GTID
log-bin=mysql-bin
server-id=...
log_slave_updates = ON

For Oracle Community Edition, we also need to add

enforce_gtid_consistency = ON
gtid_mode = ON

(GTID mode is on by default on MariaDB.)

Use a unique server-id for each server, e.g. 1 for the master and 2 for the slave. For more complicated setups (like multiple slaves), adjust accordingly.
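For example, in the gtid.cnf file created above:

# /etc/mysql/ox.conf.d/gtid.cnf on the master
server-id=1

# /etc/mysql/ox.conf.d/gtid.cnf on the slave
server-id=2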

Since applying our configuration / sizing requires reinitialization of the MySQL datadir, we wipe and recreate it. Caution: this assumes we are running an empty database. If there is data in the database you want to keep, use mysqldump; see the Preparations section above.

So, to initialize the datadir:

cd /var/lib/
mv mysql mysql.old.datadir
mkdir mysql
chown mysql.mysql mysql

(When coming from an existing installation, be sure to also wipe old binlogs; they can confuse the server on startup. Their location depends on the configuration.)

The step to initialize the datadir is different for the different DBs:

# MariaDB 10.0, 10.1
mysql_install_db

# MariaDB 10.2
mysql_install_db --user=mysql

# Oracle 5.6
mysql_install_db -u mysql

# Oracle 5.7
mysqld --initialize-insecure --user=mysql

(Don't be worried about the "insecure"; it just means we set the DB root password in the next steps.)

Then:

service mysql restart
mysql_secure_installation

We want to emphasize the last step: do not skip running mysql_secure_installation.

The steps up to here apply to both the designated master and the slave. The following steps apply to the master.

Replication Setup

Master Setup

Create a replication user on the master (but, as always, pick your own password, and use the same password in the slave setup below):

mysql -e "CREATE USER 'repl'@'gtid-slave.localdomain' IDENTIFIED BY 'IvIjyoffod2'; GRANT REPLICATION SLAVE ON *.* TO 'repl'@'gtid-slave.localdomain';"

Now would also be the time to restore a previously created mysqldump, or to add other users you need for administration, monitoring etc. (like debian-sys-maint@localhost, for example). Adding the OX users is explained below ("Creating Open-Xchange user").

# If you took a dump for restore before
mysql < backup.sql

To prepare for the initial sync of the slave, set the master read-only:

mysql -e "SET @@global.read_only = ON;"

Create a dump to initialize the slave:

# MariaDB
mysqldump --all-databases --triggers --routines --events --master-data --gtid > master.sql

# Oracle
mysqldump --all-databases --triggers --routines --events --set-gtid-purged=ON > master.sql

Transfer to the slave:

scp master.sql gtid-slave:

Slave Setup

Configure the replication master settings. Note that with GTID we don't need complicated binlog position settings etc.

Yet again DB-specific (use the repl user password from above):

# MariaDB
mysql -e 'CHANGE MASTER TO MASTER_HOST="gtid-master.localdomain", MASTER_USER="repl", MASTER_PASSWORD="IvIjyoffod2", MASTER_USE_GTID=slave_pos;'

# Oracle
mysql -e "CHANGE MASTER TO MASTER_HOST='gtid-master.localdomain', MASTER_USER='repl', MASTER_PASSWORD='IvIjyoffod2', MASTER_AUTO_POSITION=1;"
# https://www.percona.com/blog/2013/02/08/how-to-createrestore-a-slave-using-gtid-replication-in-mysql-5-6/
mysql -e "RESET MASTER;"

Read the master dump:

mysql < master.sql

Start replication on the slave:

mysql -e 'START SLAVE;'
mysql -e 'SHOW SLAVE STATUS\G'
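In the output, Slave_IO_Running and Slave_SQL_Running should both report Yes, and Seconds_Behind_Master should approach 0. A quick filter for these fields:

mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'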

Master Setup (continued)

Finally, unset read-only on the master:

# on the master
mysql -e "SET @@global.read_only = OFF;"

Configure OX for use with Master/Slave replication

Not much special wisdom here. OX was designed to be used with master/slave databases. For the ConfigDB, configdb.properties allows configuration of a readUrl and writeUrl (both of which are set to the correct values if you use oxinstaller with the correct arguments --configdb-readhost, --configdb-writehost).

(Obviously, the master is for writing and the slave is for reading.)
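A sketch of the resulting entries in configdb.properties, using the example hostnames from this section; the exact JDBC URLs depend on your installation:

# /opt/open-xchange/etc/configdb.properties (excerpt)
writeUrl=jdbc:mysql://gtid-master.localdomain/configdb
readUrl=jdbc:mysql://gtid-slave.localdomain/configdb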

For the individual UserDBs, use registerdatabase -m true for the masters and registerdatabase -m false -M ... for the respective slaves.
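A sketch of such registerdatabase calls; the credentials, database names and the master id are examples only, and the exact options should be verified against registerdatabase --help of your installed version:

# register the master
/opt/open-xchange/sbin/registerdatabase -A oxadminmaster -P admin_master_password -n oxdatabase -H gtid-master.localdomain -u openexchange -p db_password -m true
# note the id printed for the master, then register the slave against it (here: 4)
/opt/open-xchange/sbin/registerdatabase -A oxadminmaster -P admin_master_password -n oxdatabase_slave -H gtid-slave.localdomain -u openexchange -p db_password -m false -M 4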

Be sure the replication monitor is enabled in configdb.properties: com.openexchange.database.replicationMonitor=true (which is the default). While GTID based replication can exhibit synchronous semantics, it is specified to silently fall back to asynchronous behavior in certain circumstances, so synchronous behavior is not guaranteed.

We recommend, though, not to register the databases directly by their native hostname or IP, but rather to use some kind of HA system so that a floating/failover IP can easily be moved from the master to the slave in case of a master failure. Configuring and running such systems (e.g. corosync/pacemaker or keepalived) is out of scope of this documentation, however.

Creating Open-Xchange user

Now set up access for the Open-Xchange Server database user 'openexchange' to configdb and the oxdb for both groupware server addresses. These databases do not exist yet, but will be created during the Open-Xchange Server installation.

Notes:

  • Please use a real password.
  • The IPs in this example belong to the two different Open-Xchange Servers; please adjust them accordingly.
  • If you are using a database on the same host as the middleware (usually done for POCs and demo installations), you also need to grant access to the localhost host (an example statement is given after the grants below).
  • Consult AppSuite:DB_user_privileges (or grep GRANT /opt/open-xchange/sbin/initconfigdb) for an up-to-date list of required privileges. The following statement was correct as of the time of writing this section.
mysql> GRANT CREATE, LOCK TABLES, REFERENCES, INDEX, DROP, DELETE, ALTER, SELECT, UPDATE, INSERT, CREATE TEMPORARY TABLES, SHOW VIEW, SHOW DATABASES ON *.* TO 'openexchange'@'10.20.30.213' IDENTIFIED BY 'IntyoyntOat1'  WITH GRANT OPTION;
mysql> GRANT CREATE, LOCK TABLES, REFERENCES, INDEX, DROP, DELETE, ALTER, SELECT, UPDATE, INSERT, CREATE TEMPORARY TABLES, SHOW VIEW, SHOW DATABASES ON *.* TO 'openexchange'@'10.20.30.215' IDENTIFIED BY 'IntyoyntOat1'  WITH GRANT OPTION;
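If the middleware runs on the database host itself (see the note above), an analogous grant for localhost would look like this (same privileges, same caveat about picking a real password):

mysql> GRANT CREATE, LOCK TABLES, REFERENCES, INDEX, DROP, DELETE, ALTER, SELECT, UPDATE, INSERT, CREATE TEMPORARY TABLES, SHOW VIEW, SHOW DATABASES ON *.* TO 'openexchange'@'localhost' IDENTIFIED BY 'IntyoyntOat1'  WITH GRANT OPTION;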