Administration Guide

Dovecot Cluster Architecture

Dovecot Proxy

Dovecot Proxies are IMAP/POP3/LMTP proxies that are typically only needed in multi-site setups. Their job is simply to look up the user’s current site from the passdb and proxy the connection to that site’s Dovecot Director cluster. The user is also typically authenticated at this stage.

If the storage between sites is replicated, it’s possible to do a site failover. Deciding when to do a site failover can either be a manual process or be handled by an automated watchdog. The failover shouldn’t be done too quickly, because it causes a large load spike when a lot of users start logging into the failover site, where they have no local caches. So typically the watchdog script should wait at least 5 minutes to see if the network between the sites comes back up before deciding that the other side is down.

Once it’s decided that a site failover should be done, the passdb needs to be updated to switch the affected users’ site to the fallback site. Normally this is done with LDAP passdb by keeping track of username -> virtual site ID and virtual site ID -> IP address. Each physical site would have about 10-100 virtual site IDs. On failover the failed site’s virtual IDs’ IP addresses are updated. This way only a few records are updated instead of potentially millions of user records. Having multiple virtual site IDs per physical site has two advantages: 1) If there are more than two physical sites, it allows distributing the failed site’s users to multiple failover sites. 2) When the original site comes back up the users can be restored to it one virtual site at a time to avoid a load spike.
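The virtual-site indirection described above can be sketched in a few lines. Note that the record names, layout, and IP addresses here are hypothetical illustrations, not an actual passdb/LDAP schema:

```python
import hashlib

# virtual site ID -> IP address of the physical site currently serving it.
# Real deployments would have 10-100 virtual site IDs per physical site.
vsite_to_ip = {
    "site1-v0": "10.0.1.1", "site1-v1": "10.0.1.1",
    "site2-v0": "10.0.2.1", "site2-v1": "10.0.2.1",
}

def vsite_for_user(username: str, site: str, nvsites: int = 2) -> str:
    """Stable username -> virtual site ID assignment via a hash."""
    h = int(hashlib.md5(username.encode()).hexdigest(), 16)
    return f"{site}-v{h % nvsites}"

def failover(failed_site: str, fallback_ip: str) -> None:
    """Repoint only the failed site's virtual site IDs (a handful of
    records) instead of rewriting millions of per-user entries."""
    for vsite in vsite_to_ip:
        if vsite.startswith(failed_site + "-"):
            vsite_to_ip[vsite] = fallback_ip

failover("site1", "10.0.2.1")
print(vsite_to_ip["site1-v0"])  # now points at the fallback site
```

Restoring the original site works the same way in reverse, one virtual site ID at a time, which is what avoids the load spike.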

Note that during a split brain both sites may decide that the other site isn’t available and redirect all incoming connections to the local site. This means that both sites could modify the same mailbox simultaneously. With the Dovecot Object Storage backend this behavior is fine. When split brain is over the changes will be merged, so there is no data loss. The merging reduces the performance temporarily though, so it shouldn’t be relied on during normal operation.

If you wish to reduce the amount of needed hardware, Dovecot Proxies don’t necessarily need to be separated from Dovecot Directors. A single Dovecot instance can perform both operations. The only downside is that it slightly complicates understanding what the server is doing.

Dovecot Director

Dovecot Directors are IMAP/POP3/LMTP proxies that provide load balancing and high availability for the Dovecot Backends. They perform a job similar to a stateful load balancer; the main difference between a regular load balancer and the Dovecot Director is that the director makes sure that a single user is never accessed by different backends at the same time. This is needed to keep the performance good and to avoid potential problems. In front of the Dovecot directors there needs to be a load balancer to provide high availability for the directors themselves.

Dovecot Directors connect to each other with TCP in a ring formation (each director connects to the next one, while the last one connects to the first one). This ring is used to distribute the current global state of the cluster, so any of the directors can die without losing state.

Normally the directors determine the backend server for a user based on the MD5 hash of the username. This usually gives a good distribution of users across backends, and it’s very efficient for the directors: usually a director can determine the correct backend for a user without talking to any other directors. Only in some special situations, such as when a backend has recently been removed, does the director cluster temporarily perform worse with slightly higher latency, because the directors need to talk to each other to determine the current state. It usually takes less than a second to get back to normal.
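The hash-based backend selection can be illustrated with a short sketch. This is only an illustration of the idea, not the director’s actual implementation; the IP addresses are placeholders:

```python
import hashlib

def build_ring(backends: dict) -> list:
    """Expand each backend IP into vhost_count ring slots (weighting)."""
    ring = []
    for ip, vhost_count in sorted(backends.items()):
        ring.extend([ip] * vhost_count)
    return ring

def backend_for_user(username: str, ring: list) -> str:
    """Every director computes the same answer from the username alone,
    without consulting its peers."""
    h = int(hashlib.md5(username.encode()).hexdigest(), 16)
    return ring[h % len(ring)]

ring = build_ring({"10.0.0.1": 100, "10.0.0.2": 100})
print(backend_for_user("user@example.com", ring))
```

Because the mapping is a pure function of the shared state (the backend list and vhost counts), directors only need to talk to each other when that state is changing.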

When a user logs in, the user is assigned to a specific backend if this hasn’t already been done. The assignment lasts for 15 minutes after the user’s last session has closed; afterwards the user may end up on a different backend. It’s also possible to explicitly move users around in the cluster (doveadm director move).

It’s possible to assign different amounts of work to different backend servers by changing their “vhost count”. By default each backend has it set to 100. If you want one backend to receive double the number of users, set its vhost count to 200; for half the number of users, set it to 50. So for example, if the vhost counts for 3 backends are A=50, B=100, C=200, the probabilities of the backends getting connections are:

  • A: 50/(50+100+200) = 14%
  • B: 100/(50+100+200) = 29%
  • C: 200/(50+100+200) = 57%
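The percentages above can be reproduced directly:

```python
# Each backend's share is its vhost count divided by the total.
vhosts = {"A": 50, "B": 100, "C": 200}
total = sum(vhosts.values())  # 350
for name, count in vhosts.items():
    print(f"{name}: {count}/{total} = {count / total:.0%}")
```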

Changing the vhost count affects only newly assigned users, so it doesn’t have an immediate effect. Running doveadm director flush causes the existing connections to be moved immediately.

Dovecot Backend

The Dovecot Backend does all the hard work of reading and writing mails to storage and handling all of the IMAP/POP3/LMTP protocols. Dovecot Backend is connected to the object storage where users’ mails and mail indexes are stored.

When a user connects to Dovecot to read mails, the user’s mail indexes are fetched from the object storage and cached in the local filesystem. The mail indexes are updated locally while the user makes mailbox modifications. The modified local indexes are uploaded back to object storage in the background every 5 minutes, except for LMTP mail deliveries. With LMTP mail deliveries the indexes are uploaded only every 10th mail (obox_max_rescan_mail_count setting) to avoid unnecessary object storage writes. The index updates for LMTP deliveries don’t contain anything that can’t be recreated from the mails themselves.

Dovecot Backends are stateless, so if a server crashes, the only thing lost for the logged-in users is the most recent message flag updates. When a user next logs in, possibly to another backend, the indexes are fetched again from the object storage into the local cache. Because LMTP mail deliveries don’t update indexes immediately, the email objects are also listed once for each accessed folder to find out if there are any newly delivered mails that don’t exist yet in the index.

Dovecot backends attempt to do as much in the local cache as possible to minimize object storage I/O: the larger the local cache, the less object storage I/O there is. As a rule of thumb, each backend should have at least 2 MB of local cache per active user (e.g. if there are 100 000 users per backend who are receiving or accessing mails within 15 minutes, there should be at least 200 GB of local cache on the backend). It’s important that the local cache doesn’t become a bottleneck, so ideally it would use SSDs. Alternatives are an in-memory disk (tmpfs) or a filesystem on a SAN that provides enough disk IOPS. (NFS should not be used for the local cache.) Dovecot never fsyncs when writing to the local cache, so after a server crash the cache may be inconsistent or corrupted. This is why the caches should be deleted at server boot-up.
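The sizing rule of thumb above works out as a simple back-of-the-envelope calculation:

```python
# 2 MB of local cache per active user, 100 000 active users per backend.
active_users = 100_000
cache_per_user_mb = 2
required_gb = active_users * cache_per_user_mb / 1000
print(f"{required_gb:.0f} GB")  # 200 GB
```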

Password databases (passdb) and User Databases (userdb)

Dovecot splits all authentication lookups into two categories:

  • passdb lookups most importantly authenticate the user. They also provide any other pre-login information needed for the user, such as:
    • Which server the user is proxied to.
    • Whether the user should be allowed to log in at all (temporarily or permanently).
  • userdb lookups retrieve post-login information specific to the user. This may include:
    • Mailbox location information
    • Quota limit
    • Overriding settings for the user (almost any setting can be overridden)

Passdb lookups are done by:

                        Dovecot Director   Dovecot Backend
  IMAP & POP3 logins    yes                yes
  LMTP mail delivery    yes                -
  doveadm commands      yes                -

Userdb lookups are done by:

                        Dovecot Director   Dovecot Backend
  IMAP & POP3 logins    -                  yes
  LMTP mail delivery    -                  yes
  doveadm commands      -                  yes

Prefetch Userdb

During IMAP & POP3 logins to a Dovecot backend, both passdb and userdb lookups are performed. To avoid two LDAP lookups, a prefetch userdb is used. This simply means that the passdb lookup is configured to return both the passdb and userdb fields, with the userdb fields prefixed with the “userdb_” string. This slightly complicates the configuration though, because any change to the userdb fields must now be made in both the passdb and userdb configuration.
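A minimal sketch of such a configuration; the file paths follow common Dovecot examples and the LDAP attribute names are placeholders:

```
userdb {
  driver = prefetch
}

passdb {
  driver = ldap
  args = /etc/dovecot/dovecot-ldap.conf.ext
}

# In dovecot-ldap.conf.ext the passdb lookup also returns the userdb
# fields, prefixed with "userdb_", for example:
#   pass_attrs = userPassword=password, homeDirectory=userdb_home
```

Note that lookups that don’t go through a passdb (LMTP deliveries and doveadm commands, per the tables above) still need a regular userdb configured alongside the prefetch userdb.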

Object Storage Plugin

Dovecot obox format is split into two main categories: mail object handling and index object handling.

Mail Objects

The mail object handling is straightforward: each mail is stored in its own separate object. The object name is a uniquely generated name, which we call the object ID (OID). The mails are also cached locally using an fscache wrapper, which uses a global cache directory with a configurable maximum size. If the object storage access is fast, this cache doesn’t need to be very large, but it should still exist. A small cache that usually stays in memory is likely good (e.g. 1 GB).

The mail object names look like:

  user-hash/user@domain/mailboxes/folder-guid/oid

For example:

  b5/899/user@example.com/mailboxes/00d7d12ea08a3153175e0000dfbea952/d88ff1001d4bf753a1b800001accfe22
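A small sketch that assembles such a path. The user-hash derivation shown here is an assumption for illustration only; Dovecot’s actual hashing scheme may differ:

```python
import hashlib

def mail_object_path(user: str, folder_guid: str, oid: str) -> str:
    # Assumed scheme: short hash prefixes spread users across the
    # storage namespace (like the "b5/899/" levels in the example).
    h = hashlib.md5(user.encode()).hexdigest()
    user_hash = f"{h[:2]}/{h[2:5]}"
    return f"{user_hash}/{user}/mailboxes/{folder_guid}/{oid}"

path = mail_object_path(
    "user@example.com",
    "00d7d12ea08a3153175e0000dfbea952",
    "d88ff1001d4bf753a1b800001accfe22",
)
print(path)
```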

Index Objects

Dovecot obox format uses the normal Dovecot index file formats, except they are packed into index bundles when they are stored to object storage. The indexes are written lazily to the object storage in order to minimize the object storage I/O.

There are two types of index bundles: base bundles and diff bundles. The base bundles may be large and are updated somewhat rarely. The diff bundles contain the latest changes since the base bundle and are the ones usually updated. This avoids constantly uploading large index objects when very little has changed.

All objects are created with unique object names. This guarantees that two servers can’t accidentally overwrite each other’s changes. Instead, there may be two conflicting index bundle objects. If Dovecot notices such a conflict, it merges the conflicting indexes using the dsync algorithm without data loss. This allows active-active multi-site setups to run safely during a split brain.

The base index object names look like:

  user-hash/user@domain/mailboxes/folder-guid/idx/bundle.timestamp-secs.timestamp-usecs.unique-id

For example:

  b5/899/user@example.com/mailboxes/00d7d12ea08a3153175e0000dfbea952/idx/bundle.53f74dc2.0fbcf.c96d802b5d4df75307bb00001accfe22

The diff index object names look the same, except another “-unique-id” is appended after the base bundle name.
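The naming scheme can be sketched as follows. Treating the two timestamp fields as hex-encoded is an assumption based on the example above, and the second unique ID is a hypothetical value:

```python
def base_bundle_name(ts_secs: int, ts_usecs: int, unique_id: str) -> str:
    # timestamps appear hex-encoded in the example above (assumption)
    return f"bundle.{ts_secs:08x}.{ts_usecs:05x}.{unique_id}"

def diff_bundle_name(base_name: str, unique_id: str) -> str:
    # a diff bundle appends another "-unique-id" after the base name
    return f"{base_name}-{unique_id}"

base = base_bundle_name(0x53f74dc2, 0x0fbcf,
                        "c96d802b5d4df75307bb00001accfe22")
print(base)
print(diff_bundle_name(base, "d0aa802b5d4df75307bb00001accfe22"))
```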

Example Use Cases

Example 1: Receiving a Mail

  1. Mail is sent by a user using an email client, which sends the mail to the user’s own MTA (Mail Transport Agent).
  2. Mail is received by the destination user’s MTA.
  3. MTA performs antispam and antivirus checks and potentially rejects the mail or tags it with extra headers to indicate it’s spam.
  4. Mail is sent to the Dovecot Proxy with LMTP protocol.
    1. The proxy is chosen by load balancer.
  5. Dovecot Proxy performs a passdb lookup (from LDAP) to find out the user’s primary site.
  6. Dovecot Proxy forwards the LMTP connection to the correct site’s director cluster (local or remote).
    1. The director is chosen by load balancer (e.g. HAproxy).
  7. Dovecot Director looks up or assigns a Dovecot backend for the user and forwards the LMTP connection to the Dovecot Backend.
  8. Dovecot Backend performs userdb lookup to find where and how to save the mail.
  9. Dovecot Backend saves the mail to object storage:
    1. Check if the user’s local cache is up-to-date (list user’s index objects)
    2. If not, fetch the user’s index objects to local cache (1-2 GETs)
    3. Check if the INBOX exists in local cache
      1. If not, fetch the INBOX’s index objects to local cache (1-2 GETs). Also list email objects in INBOX to find any new emails that don’t exist in the index yet (backend failover).
    4. For each new email object found, look up its GUID and add it to the index (1 HEAD per new email)
    5. There is normally a maximum of 10 new email objects (obox_max_rescan_mail_count setting)
    6. Upload the mail to object storage
      1. Write the mail to local fscache
    7. Modify the local indexes
      1. Usually the indexes aren’t uploaded, but every 10th mail (obox_max_rescan_mail_count setting) the indexes are uploaded to object storage (1 PUT + 1 DELETE)

Example 2: Reading a Mail

  1. User connects to Dovecot cluster with an IMAP client, possibly via a webmail.
  2. The IMAP client connects to a Dovecot Proxy.
    1. The proxy is chosen by load balancer.
  3. Dovecot Proxy performs a passdb lookup (from LDAP) to find out the user’s primary site.
    1. During migration the passdb lookup would direct non-migrated users to the old system.
  4. Dovecot Proxy forwards the IMAP connection to the correct site’s director cluster (local or remote).
    1. The director is chosen by load balancer (e.g. HAproxy).
  5. Dovecot Director looks up or assigns a Dovecot backend for the user and forwards the IMAP connection to the Dovecot Backend.
  6. Dovecot Backend performs a userdb lookup to find where and how to access user’s mails.
  7. Dovecot Backend checks if the user’s local cache is up-to-date (list user’s index objects)
    1. If not, fetch the user’s root index objects to local cache (1-2 GETs)
  8. The IMAP client opens INBOX folder.
    1. Dovecot Backend checks if the INBOX exists in local cache
      1. If not, fetch the INBOX’s index objects to local cache (1-2 GETs). Also list email objects in INBOX to find any new emails that don’t exist in the index yet (backend failover).
        1. For each new email object found, look up its GUID and add it to the index (1 HEAD per new email)
        2. There is normally a maximum of 10 new email objects (obox_max_rescan_mail_count setting).
  9. The IMAP client fetches metadata (e.g. headers) for new emails.
    1. Dovecot usually replies to these from the locally cached INBOX indexes without object storage access.
  10. The IMAP client fetches bodies for the new email(s).
    1. Dovecot looks up if the mail is already in local fscache and serves from there if possible.
    2. Otherwise, Dovecot retrieves the mail from object storage and writes it to local fscache (1 GET)
  11. The IMAP client sets a \Seen flag for a mail.
    1. Dovecot updates the local index.
    2. The modified index will be uploaded to object storage within the next 5 minutes (1 PUT + 1 DELETE)
  12. The IMAP client logs out.

Director Administration

Directors can be managed using the “doveadm director” commands. See the “doveadm help director” man page for the full command parameters.

Backend Modifications

The backends can be changed with:

  • doveadm director add: Add a new backend or change an existing one’s vhost count.
    • New servers should also be added to the director_mail_servers setting in dovecot.conf so they are still known after a cluster restart.
  • doveadm director update: Update the vhost count of an existing backend. The only difference from “doveadm director add” is that it can’t accidentally add a new backend.
  • doveadm director up: Mark a backend as being “up”. This is the default state. This is usually updated automatically by dovemon.
  • doveadm director down: Mark a backend as being “down”. This is effectively the same as changing its vhost count to 0. This is usually updated automatically by dovemon.
  • doveadm director remove: Remove a backend entirely. This should be used only if you permanently remove a server.
  • doveadm director flush: Reassign the users on one specific backend, or on all backends, according to each user’s current hash. This is needed after the “down” command, or after setting the vhost count to 0, to actually remove the existing user assignments from the host.
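The corresponding dovecot.conf fragment might look like this (the IP addresses are placeholders): director_servers lists the members of the director ring, and director_mail_servers lists the backends.

```
director_servers = 10.0.0.10 10.0.0.11 10.0.0.12
director_mail_servers = 10.0.0.20 10.0.0.21 10.0.0.22
```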

The backend health checking is usually done by the dovemon script, which automatically scans the backends and determines if they are up or down and uses these doveadm commands to update the backend states. See the “Dovecot Pro Director Configuration Manual” for more information about dovemon.

You can see the current backend states with the doveadm director status command without parameters. If you want to see which backend a user is currently assigned to and where the user may end up in the future, use doveadm director status user@domain.

Cleanly Removing Backend

The cleanest way to take down a working backend server is to:

  • doveadm director update ip-addr 0
    • No longer send any new users to this backend. Wait here as long as possible for the existing connections to die (at least a few minutes would be ideal).
  • On the backend server: doveadm metacache flushall
    • Flush all pending metacache changes to object storage.
  • doveadm director flush ip-addr
    • Forget about the last users assigned to the backend and move them elsewhere.
  • On the backend server: doveadm metacache flushall
    • Final flush to make sure there are no more metacache changes.
  • If the server is permanently removed:
    • doveadm director remove ip-addr
    • Remove the server from the director_mail_servers setting in dovecot.conf.

Director Ring Modifications

A new director server is added by:

  • Add the server to director_servers setting so that the director is remembered even after a cluster restart.
  • doveadm director ring add command can be used to add the director to an already running ring.

A director server can be removed with doveadm director ring remove. You can see the current ring state with doveadm director ring status.

Director Disaster Recovery

Director servers share the same global state, which means that if there is a bug, it probably ends up affecting the entire director cluster. Although the director is nowadays quite well tested, it’s possible that something new and unexpected happens. This chapter explains how to fix such situations if they ever occur.

If the director ring has somehow become confused and the ring’s connections don’t look exactly correct, you can restart the directors that are connected to the wrong servers (service dovecot restart). Directors should always automatically retry connecting to their correct neighbors after failures, so this manual restarting isn’t normally necessary.

Full Director State Reset

If the directors start crashing or logging errors and failing user logins, there are two ways the service could be restored:

  • doveadm director flush -F resets all the users’ state immediately. This command shouldn’t be used unless absolutely necessary, because it immediately forgets all the existing user assignments without killing any existing connections. This means that an active user could be simultaneously accessed by different backends.
  • A safer way is to shut down the entire director cluster and start it back up from a zero state. This may also be necessary if the forced director flush doesn’t work for some reason. Note that it’s not enough to simply restart each director separately, because after the restart it will receive the earlier state from the next running director. All the directors must be shut down first.

Mailbox Administration

Doveadm Mailbox Commands

These commands should be run on one of the Dovecot directors. The director is then responsible for forwarding the command to be run in the correct backend. This guarantees that two backend servers don’t attempt to modify the same user’s mailbox at the same time (which might cause problems).

  • doveadm fetch: Fetch mail contents or metadata.
    • doveadm search does the same as doveadm fetch ‘mailbox-guid uid’. It’s useful for quick checks where you don’t want to write the full fetch command.
  • doveadm copy & doveadm move: Copy or move mails to another folder, potentially owned by another user.
  • doveadm reduplicate: Deduplicate mails either by their GUID or by Message-Id: header.
  • doveadm expunge: Expunge mails (without moving to Trash).
  • doveadm flags add/remove/replace: Update IMAP flags for a mail
  • doveadm force-rsync: Try to fix a broken mailbox (or verify that all is ok)
  • doveadm index: Index any mails that aren’t indexed yet. Mainly useful if full text search indexing is enabled.
  • doveadm mailbox list: List user’s folders.
  • doveadm mailbox create/delete/rename: Modify folders.
  • doveadm mailbox subscribe/unsubscribe: Modify IMAP folder subscriptions.
  • doveadm mailbox status: Quickly lookup folder metadata (# of mails, # of unseen mails, etc)

Object Storage Mailbox Format Administration

The object storage plugin administration is mainly related to making sure that the mail cache and the index cache perform efficiently and they don’t take up all the disk space.

The mail cache size is specified in the plugin { obox_fs } setting as the parameter to fscache. With a fast object storage this should usually be a relatively small value, such as 1 GB. It’s not a user-visible problem if the fscache runs out of disk space (although it will log some errors in that case), so it might be a good idea to use a separate partition for it. If needed, you may also manually delete parts or all of the fscache with the standard rm command. Afterwards you should run doveadm fscache rescan so the fscache index knows the correct remaining size.
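A hedged sketch of such a configuration; the 1 GB size and cache path are examples, and the parent storage driver string is a placeholder, so check the obox plugin documentation for your storage backend’s exact syntax:

```
plugin {
  # fscache:<max size>:<cache directory>:<parent storage>
  obox_fs = fscache:1G:/var/cache/mails:<storage-driver-and-parameters>
}
```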

The index cache size is specified in the metacache_max_size setting. This should ideally be as large as possible to reduce both object storage GETs for the indexes and also local filesystem writes when the indexes are unpacked to local cache. You can also manually clean some older indexes from cache by running doveadm metacache clean command.

If multiple backends do changes to the same mailbox at the same time, Dovecot will eventually perform a dsync-merge for the indexes. Due to dsync being quite a complicated algorithm there’s a chance that the merging may trigger a bug/crash that won’t fix itself automatically. If this happens, the bug should be reported to get it properly fixed, but a quick workaround is to run: doveadm -o plugin/metacache_disable_merging=yes force-resync -u user@domain INBOX

Moving/Migrating/Converting/Exporting/Importing Mailboxes

Almost everything related to moving or converting mail accounts can be done using the dsync tool. It can do either one-way or two-way synchronization of mailboxes. See doveadm help sync and doveadm help backup for more information. http://wiki2.dovecot.org/Migration/Dsyncs also describes how to migrate mails from other IMAP/POP3 servers.

Mails can also be imported into an existing mailbox using the doveadm import command. The new mails are appended to their respective folders, creating the folders if necessary. It’s also possible to give a prefix for the new folders, such as “backup-restored-20140824/”.

Mails can also be continuously replicated between two Dovecot servers using the replicator service. See http://wiki2.dovecot.org/Replication for more information.

Debugging

Each IMAP, POP3 and LMTP connection has its own unique session ID. This ID is logged on all log lines and passed between Dovecot services, which allows tracking a session all the way through the directors to the backends and their various processes. The session IDs look like <ggPiljkBBAAAAAAAAAAAAAAAAAAAAAAB>.

Logging

If problems are happening, it’s much easier to see what’s going wrong if all the errors are logged into a separate log file, so you can quickly see all of them at once. With rsyslog you can configure this with:

mail.* -/var/log/dovecot.log
mail.warning;mail.error;mail.crit -/var/log/dovecot.err

Another thing that often needs to be changed is to disable flood control in rsyslog. Dovecot may log a lot, especially with debug logging enabled, and rsyslog’s default settings often lose log messages.

Another way to look at recent Dovecot errors is to run doveadm log error, which shows up to the last 1000 errors logged by Dovecot since it was last started.

Authentication Debugging

Most importantly set auth_debug=yes, which makes Dovecot log a debug line for just about anything related to authentication. If you’re having problems with passwords, you can also set auth_debug_passwords=yes which will log them in plaintext.

For easily testing authentication, use: doveadm auth test user@domain password

For looking up userdb information for a user, use: doveadm user user@domain

For simulating a full login with both passdb and userdb lookup, use: doveadm auth login user@domain password

Mail Debugging

Setting mail_debug=yes will make Dovecot log all kinds of things about mailbox initialization. Note that it won’t increase error logging at all, so if you’re having some random problems it’s unlikely to provide any help.

If there are any problems with a mailbox, Dovecot should automatically fix them. If that doesn’t work for any reason, you can also manually request fixing a mailbox by running: doveadm force-resync -u user@domain INBOX. Here INBOX should be replaced with the folder that is having problems, or ‘*’ if all folders should be fixed.

Users may sometimes complain that they have lost emails. The cause is almost always that one of the user’s own email clients deleted them accidentally, typically a POP3 client on a new device configured to delete mails after downloading them. For this reason it’s very useful to enable the mail_log plugin and log all the events that may cause mails to be lost. This way it’s always possible to find out from the logs what exactly caused messages to be deleted.

If you’re familiar enough with Dovecot’s index files, you can use the doveadm dump command to look at their contents in a human-readable format and possibly determine if there is something wrong in them.

Crashes

Dovecot has been designed to crash rather than continue in a potentially unsafe manner that could cause data loss. Most crashes happen just once and retrying the operation succeeds, so even if you see them it’s usually not a big problem. Of course, all crashes are bugs that should eventually be fixed, so feel free to report them even if they’re not causing any visible problems. Crash reports are best accompanied by a gdb backtrace as described in http://dovecot.org/bugreport.html

Instead of crashing, there have also been some rare bugs in Dovecot where a process could go into an infinite loop, causing it to use 100% CPU. If you detect such a process, it would again be very helpful to get a gdb backtrace of the running process:

  • gdb -p pid-of-process
  • bt full

After getting the backtrace, you can just kill -9 the process.

Quota

User’s current quota usage can be looked up with: doveadm quota get -u user@domain

User’s current quota may sometimes be wrong for various reasons (typically only after some other problems). The quota can be recalculated with: doveadm quota recalc -u user@domain

Sieve

When Sieve scripts are uploaded using the ManageSieve service, they’re immediately compiled, and the upload fails if any problems are detected. Not all problems can be detected at compile time, however, so it’s also possible that a Sieve script fails at runtime. In that case the errors are written to the .dovecot.sieve.log file (right next to the .dovecot.sieve file itself in the user’s home directory).

Stress Testing

The easiest way to stress test Dovecot is to use the imaptest tool: http://imapwiki.org/ImapTest. It can flood a server with random commands, and it can also attempt to mimic a large number of real-world clients.