= Cross-folder fulltext search with Dovecot =
 
  
The content on this page has moved to https://documentation.open-xchange.com/7.10.2/middleware/.

With 7.6.0, Open-Xchange introduces a new search API and a corresponding extension for OX App Suite. The new mail search can use the fulltext and cross-folder search capabilities provided by Dovecot. This article is a short walkthrough for setting up Dovecot and the OX App Suite backend accordingly. I assume that you already have a working Dovecot installation that serves as the primary mail backend for an existing, basically configured OX App Suite installation. I further assume that you are running Dovecot 2.2.9 and OX App Suite 7.6.0 on Debian Wheezy. We will use [[https://lucene.apache.org/solr/|Solr]] as the mail index.
 
  
== Dovecot ==
Note: Open-Xchange is in the process of migrating all its technical documentation to a new and improved documentation system (documentation.open-xchange.com). Please note as the migration takes place more information will be available on the new system and less on this system. Thank you for your understanding during this period of transition.
 
 
Both search features are implemented as Dovecot plugins. Fulltext search relies on [[http://wiki2.dovecot.org/Plugins/FTS|FTS]] and [[http://wiki2.dovecot.org/Plugins/FTS/Solr|FTS Solr]]. Cross-folder search is implemented via a virtual folder that claims to contain all mails from all other folders.

Debian Wheezy comes with Dovecot 2.1.7. As this version is quite old, we take the packages from the backports repository, which contains version 2.2.9. Most of this should also work with 2.1.7; automatic indexing via ''fts_autoindex'', however, requires 2.2.9 or newer. After adding the backports entries, my ''/etc/apt/sources.list'' looks like this:
 
 
 
deb http://ftp2.de.debian.org/debian/ wheezy main contrib non-free
deb-src http://ftp2.de.debian.org/debian/ wheezy main contrib non-free

deb http://ftp2.de.debian.org/debian/ wheezy-updates main contrib non-free
deb-src http://ftp2.de.debian.org/debian/ wheezy-updates main contrib non-free

deb http://ftp2.de.debian.org/debian/ wheezy-backports main contrib non-free
deb-src http://ftp2.de.debian.org/debian/ wheezy-backports main contrib non-free
 
 
 
=== Installing and configuring the fulltext index ===
 
 
 
We start by installing Solr. I use solr-jetty here because it is easy to configure and more lightweight than a full Tomcat installation. Later we will use the Solr admin panel to see if everything works. The admin panel uses Java Server Pages (JSP), so we also need a JDK. We install both via
 
 
 
# aptitude install solr-jetty openjdk-6-jdk
 
 
 
We also need Dovecot's FTS Solr plugin. When installing Dovecot packages, we have to explicitly choose the backports repository, or we get the original 2.1.7 packages. Install the plugin via
 
 
 
# aptitude -t wheezy-backports install dovecot-solr
 
 
 
Before we can start Jetty, which in turn starts Solr as a web app, we have to configure it. The Solr admin panel refers to a shared jQuery library that is symlinked into the web app. Therefore we have to allow Jetty to deliver symlinks. Open ''/etc/jetty/webdefault.xml'' and add the following to the ''<servlet>'' section:
 
 
 
<init-param>
  <param-name>aliases</param-name>
  <param-value>true</param-value>
</init-param>
 
 
 
Then open ''/etc/default/jetty'' and change ''NO_START'' to ''0''. To access Solr's admin panel, Jetty must accept connections from the outside, so you have to adjust ''JETTY_HOST'' accordingly. There is also a problem with the symlink that links the Solr web app into Jetty: ''/var/lib/jetty/webapps/solr'' points to ''/usr/share/solr/webapp'', which is wrong. The correct path is ''/usr/share/solr/web''. We fix this with
 
 
 
# rm /var/lib/jetty/webapps/solr
# ln -s /usr/share/solr/web /var/lib/jetty/webapps/solr
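
The same fix can be written so that it is safe to run more than once. This is only a sketch: ''repoint_link'' is a hypothetical helper name, not part of any package, and both paths are passed as arguments so the function can be tried on a scratch directory first.

```shell
# Sketch: repoint a symlink only if it does not already point at the wanted target.
# Usage (as root): repoint_link /var/lib/jetty/webapps/solr /usr/share/solr/web
repoint_link() {
    link=$1
    target=$2
    # readlink prints the current target of the symlink (empty if it is missing).
    if [ "$(readlink "$link" 2>/dev/null)" = "$target" ]; then
        echo "ok: $link already points to $target"
        return 0
    fi
    # Remove the stale link and create the correct one.
    rm -f "$link"
    ln -s "$target" "$link"
    echo "fixed: $link -> $target"
}
```

Running it a second time should only report that the link is already correct.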
 
 
 
You may also need to check the permissions of the jQuery library. By default it is a symlink at ''/usr/share/solr/web/admin/jquery-1.4.3.min.js''; a 404 for this file when visiting the Solr admin panel indicates that it cannot be served. Examine ''/var/log/jetty/*stdout.log'' for any Jetty-related issues.
 
 
 
Production use has shown that the following values in ''/etc/jetty/jetty.xml'' should be changed to make Jetty work with big mailboxes:
 
 
 
<Set name="maxIdleTime">100000</Set>
<Set name="headerBufferSize">65536</Set>
 
 
 
Now you can start Jetty with
 
 
 
# service jetty start
 
 
 
If everything was done correctly, you should reach Solr's admin panel at ''%%http://<ip or host>:8080/solr/admin%%''.

Now it is time to activate the FTS Solr plugin in Dovecot and to make use of our freshly set up indexing server. Open ''/etc/dovecot/conf.d/10-mail.conf'' and change
 
 
 
#mail_plugins

to

mail_plugins = fts fts_solr
 
 
 
To configure the plugins, open ''/etc/dovecot/conf.d/90-plugin.conf'' and change the ''plugin'' section to
 
 
 
plugin {
  fts = solr
  fts_autoindex = yes
  fts_solr = url=http://localhost:8080/solr/
}
 
 
 
''fts_autoindex = yes'' enables automatic indexing of new mails. This feature requires Dovecot 2.2.9 or newer.

Solr uses a so-called schema that defines the structure of the index. A schema is defined in an XML file. The schema for Dovecot is provided by the dovecot-solr package and can be found at ''/usr/share/doc/dovecot-core/dovecot/solr-schema.xml''. We have to replace Solr's default schema with this one:
 
 
 
# cp /usr/share/doc/dovecot-core/dovecot/solr-schema.xml /etc/solr/conf/schema.xml
 
 
 
Afterwards we restart Jetty and Dovecot to apply the changes.
 
 
 
# service jetty restart
# service dovecot restart
 
 
 
Now we can start indexing some mails to see if everything works. According to [[http://wiki2.dovecot.org/Plugins/FTS/Solr|the FTS-Solr manual]], we can index a mailbox with

# doveadm fts rescan -u <user>
# doveadm index -u <user> '*'
 
 
 
The first call instructs Dovecot to remember that a rebuild of all indexes for the given user is necessary. The second one then forces the rebuild and blocks until all mails are indexed. We use '''*''' as a wildcard for all mailboxes of the given user here. ''%%http://<ip or host>:8080/solr/admin/schema.jsp%%'' should now show a ''numDocs'' value equal to the number of mails contained in the mailbox.
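
The document count can also be read from the shell instead of the admin panel. The following is a minimal sketch, assuming that Solr's standard select handler is reachable under the URL configured above and answers with its default XML response; ''num_found'' is a helper name made up for this example. Note that the number covers the whole index, so it only matches a single mailbox as long as only one user has been indexed.

```shell
# Sketch: extract the numFound attribute from a Solr XML response read on stdin.
# Assumed usage (URL taken from the fts_solr setting above):
#   curl -s 'http://localhost:8080/solr/select?q=*:*&rows=0' | num_found
num_found() {
    # Print only the digits of the first numFound="..." attribute.
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}
```

If the function prints nothing, the response did not contain a ''numFound'' attribute, which usually means the request did not reach Solr.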
 
 
 
Congratulations, searching within mail bodies now utilizes Solr and is blazing fast!
 
 
 
=== Configuring the all-messages folder ===
 
 
 
To enable cross-folder search we configure Dovecot to add a special folder to every mailbox. From the outside the folder looks like it contains all mails from all other folders. Some more information can be found [[http://wiki2.dovecot.org/Plugins/Virtual|here]].
 
 
 
We start by creating a separate namespace for the virtual folder. Open ''/etc/dovecot/conf.d/10-mail.conf'' and add the following section:
 
 
 
namespace virtual {
  prefix = virtual.
  separator = .
  location = virtual:/etc/dovecot/virtual:INDEX=~/virtual
}
 
 
 
Additionally, you have to extend the ''mail_plugins'' directive to ''mail_plugins = fts fts_solr virtual''. Now we can create arbitrary virtual folders that appear in every user's account by simply creating a directory below ''/etc/dovecot/virtual'' and configuring that folder by placing a file named ''dovecot-virtual'' in it. We do this for our all-messages folder:
 
 
 
# mkdir -p /etc/dovecot/virtual/all
 
 
 
and create a file ''/etc/dovecot/virtual/all/dovecot-virtual'' with the following content:
 
 
 
INBOX/*
  all
INBOX
  all
 
 
 
This means that the folder aggregates all messages from all mailboxes; the ''*'' is the wildcard that selects them. You will likely want to change this if you have additional namespaces for shared and public folders. Please refer to the Dovecot Wiki in this case.

For App Suite to be able to display the original folders within the search results, Dovecot needs to announce an additional capability, which is added via
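
Creating the directory and the file can be combined into one small function. This is a sketch only: ''write_all_folder'' is a hypothetical name, and the base directory is passed as an argument so the result can be inspected in a scratch directory before it is written to ''/etc/dovecot/virtual''.

```shell
# Sketch: create <base>/all/dovecot-virtual with the content shown above.
# Usage (as root): write_all_folder /etc/dovecot/virtual
write_all_folder() {
    base=$1
    mkdir -p "$base/all"
    # The quoted EOF keeps the '*' and the leading spaces exactly as written.
    cat > "$base/all/dovecot-virtual" <<'EOF'
INBOX/*
  all
INBOX
  all
EOF
}
```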
 
 
 
imap_capability = +XDOVECOT
 
 
 
The folder is now visible in every user's account. In the last step we configure the folder to carry the ''\All'' flag. Edit ''/etc/dovecot/conf.d/15-mailboxes.conf'' and add the following section:
 
 
 
namespace virtual {
  mailbox all {
    special_use = \All
  }
}
 
 
 
Then restart Dovecot again. A short test reveals that everything works as expected:
 
 
 
# telnet <imap_host> 143
Trying <ip>...
Connected to <ip>.
Escape character is '^]'.
* OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE STARTTLS AUTH=PLAIN] Dovecot  ready.
. LOGIN <username> <password>
. OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT SORT=DISPLAY  THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS SPECIAL-USE BINARY MOVE SEARCH=FUZZY] Logged in
. NAMESPACE
* NAMESPACE (("" ".")("virtual." ".")) NIL NIL
. OK Namespace completed.
. LIST "" "*"
* LIST (\HasNoChildren \Trash) "." Trash
* LIST (\HasNoChildren) "." "Sent Items"
* LIST (\HasNoChildren \Drafts) "." Drafts
* LIST (\HasNoChildren) "." Spam
* LIST (\Noselect \HasChildren) "." virtual
* LIST (\HasNoChildren \All) "." virtual.all
* LIST (\HasNoChildren) "." INBOX
. OK List completed.
. SELECT virtual.all
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft $cl_0)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft $cl_0 \*)] Flags permitted.
* 6 EXISTS
* 0 RECENT
* OK [UNSEEN 4] First unseen.
* OK [UIDVALIDITY 1395244998] UIDs valid
* OK [UIDNEXT 7] Predicted next UID
* OK [NOMODSEQ] No permanent modsequences
. OK [READ-WRITE] Select completed (0.004 secs).
. LOGOUT
* BYE Logging out
. OK Logout completed.
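
The check for the ''\All'' flag can also be scripted once the LIST responses are available as text, for example from a saved session like the one above. The following is a minimal sketch; ''find_all_folder'' is a name made up for this example, and it only understands simple LIST lines of the form shown here.

```shell
# Sketch: read IMAP LIST responses on stdin and print the name of every
# mailbox whose flag list contains \All.
find_all_folder() {
    # Strip the carriage returns that IMAP servers send at line ends,
    # then keep only the mailbox name (quoted or unquoted) of \All lines.
    tr -d '\r' |
        sed -n 's/^\* LIST ([^)]*\\All[^)]*) "[^"]*" "\{0,1\}\([^"]*\)"\{0,1\}$/\1/p'
}
```

Fed with the LIST output from the session above, the function should print ''virtual.all'', which is the value App Suite needs as the all-messages folder.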
 
 
 
== OX App Suite ==
 
The last step is to configure OX App Suite. First, open ''/opt/open-xchange/etc/imap.properties'' and change the value of ''com.openexchange.imap.imapSearch'' to ''force-imap''. Then open ''/opt/open-xchange/etc/findbasic.properties'' and change it to
 
 
 
# Some mail backends provide a virtual folder that contains all messages of
# a user to enable cross-folder mail search. Open-Xchange can make use of
# this feature to improve the search experience.
#
# Set the value to the name of the virtual mail folder containing all messages.
# Leave blank if no such folder exists.
com.openexchange.find.basic.mail.allMessagesFolder = virtual.all

# Denotes if mail search queries should be matched against the mail body.
# This improves the search experience within the mail module, if your mail
# backend supports fast full text search. Otherwise it will slow down the
# search requests significantly.
#
# Change the value to 'true', if fast full text search is supported. Default
# is 'false'.
com.openexchange.find.basic.mail.searchmailbody = true
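
Before restarting, it can be worth reading the two values back from the file to catch typos. This is a minimal sketch with a hypothetical helper named ''get_prop''; it handles only simple ''key = value'' lines and ignores commented-out entries.

```shell
# Sketch: print the value assigned to a key in a properties file.
# Usage: get_prop /opt/open-xchange/etc/findbasic.properties \
#            com.openexchange.find.basic.mail.allMessagesFolder
get_prop() {
    file=$1
    # Escape the dots in the key so they match literally in the pattern.
    key=$(printf '%s' "$2" | sed 's/\./\\./g')
    # Strip "key =" from matching lines; if the key occurs twice, keep the last.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}
```

For the configuration above, the two keys should yield ''virtual.all'' and ''true''.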
 
 
Restart the server with
 
 
 
/etc/init.d/open-xchange restart
 
 
 
TODO: add image from App Suite UI showing the "All Folder"-Switch and explain that queries are now matched against mail bodies in addition to subject and address headers.
 

Latest revision as of 09:15, 15 May 2019

This information is valid from 7.6.0
