Friday, May 16, 2014

Installing fast indexed search for IMAP e-mail

With Dovecot and Apache Solr you can set up very fast full-text mail search. Searching through the entire contents of all messages takes a fraction of a second.

I've used Ubuntu and Debian to set this up.

Tomcat SOLR


Install the tomcat solr server with the following command:

$ apt-get install solr-tomcat

Dovecot ships its own Solr schema. The latest version is at http://hg.dovecot.org/dovecot-2.2/file/tip/doc/solr-schema.xml

Pick the schema that matches your Dovecot version:

Dovecot 2.2 (Ubuntu 14.04):
http://hg.dovecot.org/dovecot-2.2/raw-file/e99cd21e1f92/doc/solr-schema.xml

Dovecot 2.1 (Debian Wheezy):
http://hg.dovecot.org/dovecot-2.1/raw-file/300a3a81c2cb/doc/solr-schema.xml

Download it like this (the Dovecot 2.2 URL is shown; use the 2.1 URL on Debian Wheezy):

$ wget http://hg.dovecot.org/dovecot-2.2/raw-file/e99cd21e1f92/doc/solr-schema.xml
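Before replacing the stock schema it doesn't hurt to check that wget really fetched a schema and not an HTML error page. This is only a rough sketch: the function name looks_like_schema is made up for this post, and it assumes the Dovecot schema file has a <schema ...> root element. It is not real XML validation.

```shell
# looks_like_schema FILE: succeed if FILE contains a <schema ...> element.
# (hypothetical helper name; a rough sanity check, not XML validation)
looks_like_schema() {
    grep -q '<schema[ >]' "$1"
}
```

Used like this: looks_like_schema solr-schema.xml && echo "looks fine"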

Back up the stock schema and move the Dovecot one into its place:

$ mv /etc/solr/conf/schema.xml /etc/solr/conf/schema.xml.bak
$ mv solr-schema.xml /etc/solr/conf/schema.xml

Then restart Tomcat so Solr loads the new schema:

$ service tomcat6 restart

Check that Tomcat works by browsing to:

http://localhost:8080
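On a server without a browser the same check can be done from the shell. A minimal sketch, assuming curl is installed. The helper name http_status is my own invention, not something Tomcat or Solr provides.

```shell
# http_status URL: print only the HTTP status code the server answers with.
# (hypothetical helper name; assumes curl is installed)
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

Running http_status http://localhost:8080/ should print 200 when Tomcat is up. The same call with http://localhost:8080/solr/ tells you whether the Solr webapp answers as well.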

Note: I run Proxmox with OpenVZ, and I had to assign 2 CPUs to the OpenVZ container to get Tomcat to run properly!

Security note
The Solr admin page is publicly accessible by default! I made sure only local connections are allowed by adding this line inside the <Host>...</Host> element in /etc/tomcat6/server.xml:

<Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127.0.0.1"/>
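The valve only takes effect after Tomcat has been restarted (service tomcat6 restart). To see whether it works, request the Solr URL from another machine. A rough sketch, assuming curl is installed and that the valve answers refused clients with HTTP 403. The function name solr_access is hypothetical; the port 8080 and the /solr/ path are the ones used in this post.

```shell
# solr_access HOST: report whether Solr on HOST answers, refuses us, or is down.
# (hypothetical helper; assumes a denying RemoteAddrValve answers with 403)
solr_access() {
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$1:8080/solr/")
    case "$code" in
        403)   echo "blocked" ;;
        2*|3*) echo "open" ;;
        *)     echo "no answer ($code)" ;;
    esac
}
```

Run from a remote machine against the mail server's hostname this should print "blocked", while solr_access 127.0.0.1 on the server itself should still print "open".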

Dovecot

Now set up Dovecot.

Install the plugin package:

$ apt-get install dovecot-solr

Then modify the following two files:

/etc/dovecot/conf.d/10-mail.conf:

...
# Space separated list of plugins to load for all services. Plugins specific to
# IMAP, LDA, etc. are added to this list in their own .conf files.
#mail_plugins =
mail_plugins = $mail_plugins fts fts_solr
...


/etc/dovecot/conf.d/90-plugin.conf:

...
plugin {
  fts = solr
  fts_solr = break-imap-search url=http://127.0.0.1:8080/solr/
}
...
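Dovecot has to be restarted (service dovecot restart) before it picks up these changes. Afterwards you can ask Dovecot which plugins it actually loads. This is a small sketch, assuming the doveconf tool that ships with Dovecot 2.x is on the PATH. The function name fts_enabled is made up for this example.

```shell
# fts_enabled: succeed if the active Dovecot config loads both fts and fts_solr.
# (hypothetical helper; "doveconf -h NAME" prints only the value of setting NAME)
fts_enabled() {
    plugins=" $(doveconf -h mail_plugins) "
    case "$plugins" in
        *" fts "*) ;;
        *) return 1 ;;
    esac
    case "$plugins" in
        *" fts_solr "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: fts_enabled && echo "Solr search is configured"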

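The "break-imap-search" option makes Dovecot use Solr for TEXT and BODY searches as well. This makes your server non-IMAP-compliant (IMAP SEARCH is specified as substring matching, while Solr matches whole words), but it's what people want ;). In v2.1+ this behaviour is always enabled, so the option should only matter on older versions.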

Now when an IMAP client sends a "SEARCH TEXT keyword" command, you should see log entries like this one in /var/log/mail.log:

May 15 14:19:29 mail.example.com dovecot: indexer-worker(admin@intermesh.dev): Indexed 294 messages in INBOX
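You don't have to wait for a mail client to test this. The sketch below triggers indexing by hand and then runs the same kind of TEXT search from the command line. It assumes the doveadm index and doveadm search commands that come with Dovecot 2.1 and later. The function name count_matches is my own.

```shell
# count_matches USER KEYWORD: index USER's INBOX, then print how many
# messages in it match a TEXT search for KEYWORD.
# (hypothetical helper; doveadm search prints one line per matching message)
count_matches() {
    doveadm index -u "$1" INBOX &&
    doveadm search -u "$1" mailbox INBOX text "$2" | wc -l | tr -d ' '
}
```

For example: count_matches admin@intermesh.dev keyword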

For more info read: http://wiki2.dovecot.org/Plugins/FTS/Solr

Enjoy the fast searches!

3 comments:

  1. Is it also possible to search with more than just one keyword?
