Foobacca

Geeking about

Custom Reports With Notmuch

It’s just a little thing, but I like to see counts of messages in certain categories. As notmuch provides a count command, it is simple to write a script that will provide you with the counts you want. Here’s one I prepared earlier:

#!/bin/sh

# I'm interested in the thread count rather than individual messages
NMC="/usr/bin/notmuch count --output=threads"

# I break it down by total, how many are flagged and how many are unread
# This function produces a single line
# POSIX function syntax: the "function" keyword is a bashism that
# dash (the /bin/sh on Debian) does not accept
tag_report() {
  TAG=$1
  TOTAL_COUNT=$($NMC "tag:$TAG")
  FLAGGED_COUNT=$($NMC "tag:$TAG" AND tag:flagged)
  UNREAD_COUNT=$($NMC "tag:$TAG" AND tag:unread)

  echo "$TAG: total $TOTAL_COUNT flagged $FLAGGED_COUNT unread $UNREAD_COUNT"
}

# then these are the tags I'm interested in, so print the reports
# one line per tag
full_report() {
  tag_report inbox
  tag_report action
  tag_report waiting
  tag_report readlater
}

# finally, pipe through column to make the results line up nicely
full_report | column -t

I’ve chosen to do this by having a small tmux window above my mail client, using watch to refresh. The command I use is:

watch -t -n 300 bin/mailreport

The -t means don’t bother showing the time (saving a line of output) and the -n 300 means only run the report once every 5 minutes. The output (currently) looks like:

inbox:      total  28  flagged  4  unread  0
action:     total  55  flagged  4  unread  11
waiting:    total  13  flagged  2  unread  1
readlater:  total  60  flagged  2  unread  3

Matching Headers With Regular Expressions Using Afew

This post is part of my notmuch of a journey series.

notmuch is somewhat limited in the choice of headers it indexes, even though support for more headers is an oft-requested feature. However, afew parses the entire email itself, so it can match against any header it likes.

The recently added HeaderMatchingFilter will match a regular expression you provide against an arbitrary header you specify. The text matched can be used as the tag. A simple version of the ListMailsFilter could be implemented as:

[HeaderMatchingFilter.1]
header = List-Id
pattern = <(?P<list_id>.*)>
tags = +lists +{list_id}

So if the List-Id header field is found and contains text wrapped in <>, the text between the angle brackets is captured as list_id and used as a tag, and in addition the tag lists will be added to the message.

My initial motivation was that we use email addresses of the form:

<projectname>-team@aptivate.org

for various projects, so I wanted to use the projectname as the tag without having to write a filter for every project. With HeaderMatchingFilter I can do:

[HeaderMatchingFilter.1]
header = To
pattern = (?P<team_name>[a-z0-9]+)-team@aptivate.org
tags = +{team_name}

[HeaderMatchingFilter.2]
header = Cc
pattern = (?P<team_name>[a-z0-9]+)-team@aptivate.org
tags = +{team_name}

Also, we use redmine heavily. It adds a header saying what project the email is associated with. I can use that by doing:

[HeaderMatchingFilter.3]
header = X-Redmine-Project
pattern = (?P<project>.*)
tags = +{project}

On a related note, I have now written more documentation for afew, so hopefully the path after me will be smoother.

Initial Tagging and Afew

This post is part of my notmuch of a journey series.

I love the way gmail uses labels and I loved using them in sup. notmuch has tags for the same purpose so I want to use them extensively, and to have most tags added automatically for me.

notmuch new mail workflow

There are several ways to do this, but the way I’ve done it is:

  • offlineimap runs every X minutes
  • the offlineimap postsynchook is set to notmuch new – so when the IMAP run is complete, notmuch is called
  • notmuch new processes all the email it doesn’t know about, adding a set of tags to every new message
  • notmuch then checks for the file <maildir>/.notmuch/hooks/post-new and if it exists then it will run it
  • in my case, the hook file calls afew --new --tag

The tags added by notmuch new are set in the notmuch config file, in the part that looks like:

[new]
tags=new

So all new messages will have those tags added. Your post-new hook can then run only against mail that is tagged “new”, making it nice and fast. And as the final step, the post-new hooks should remove the “new” tag from all messages that have it, so that you don’t have to re-process them next time. A common idea is also to remove the “new” tag from messages that match filters you want to archive, and at the end of your filter list you replace the “new” tag with the “inbox” tag for any messages still tagged “new”.

You can do this with a basic shell script that calls the notmuch command line binary, as shown on the notmuch initial tagging page. However the command line binary only looks at a few headers (From, To, Cc, Subject, Date and message ID as far as I can tell). So I thought I’d look into:

afew

afew is a Python command-line tool that runs filters against a set of messages and generates tags. It can be installed using pip. However, at the time of writing, the documentation is not as useful as it could be, and I ended up reading the code to work out what was going on.

One key bit of information is that the config file is stored at ~/.config/afew/config, which meant I could search github for path:afew/config. That only brought up two hits when I just tried it, but it’s a start for seeing what others do. The most useful one I found was by pazz and you might also want to look at my afew config.

When you’re playing with afew, running afew --help will get you started reasonably. And the command you want for the post-new hook is:

afew --tag --new

Which will run against the new messages and run the tag filters.
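A minimal sketch of what the hook file itself might look like, assuming afew is installed somewhere the hook can find it. The AFEW variable and the post_new function are names made up for this illustration; only the afew --tag --new command comes from the text above.

```shell
#!/bin/sh
# Sketch of <maildir>/.notmuch/hooks/post-new (the file must be executable).
# AFEW is a made-up override for installs where afew is not on the PATH,
# for example when it lives inside a pythonbrew directory.
AFEW="${AFEW:-afew}"

post_new() {
    # Run the tag filters against the messages notmuch has just tagged "new"
    "$AFEW" --tag --new
}

# If afew cannot be found, say so and carry on rather than failing the hook
if command -v "$AFEW" >/dev/null 2>&1; then
    post_new
else
    echo "post-new: $AFEW not found, skipping tagging" >&2
fi
```

The check before the call is a design choice rather than a requirement: it means a broken afew install leaves mail tagged “new” with a readable message, instead of a bare “command not found” from the hook.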

afew filters

I had to read the code to work out what some of the filters were, so here are my findings as a first stab at documenting them. Later I intend to improve the docs.

afew’s built in filters

The default filter set (if you don’t specify anything in the config) is:

[SpamFilter]
[ClassifyingFilter]
[KillThreadsFilter]
[ListMailsFilter]
[ArchiveSentMailsFilter]
[InboxFilter]

These can be customised by specifying settings beneath them. The standard settings are:

  • message – text that will be displayed while running this filter if the verbosity is high enough.
  • query – the query to use against the messages, specified in standard notmuch format
  • tags – the tags to add to messages that match the query
  • tag_blacklist – if the message has one of these tags, don’t add tags to it

Note that most of the filters below set their own value for message, query and/or tags, and some ignore some of the above settings.

SpamFilter

This looks for the header X-Spam-Flag – if it finds it, the spam tag is set. You can override the tag used for spam if you want.

The settings you can use are:

  • spam_tag is the tag used to identify spam. It defaults to “spam”

ClassifyingFilter

This is to do with learning what tags match what without explicit rules. I haven’t worked this one out fully, but the project README has some details on how to use it. If I work out more I’ll blog about it and link to that from here.

KillThreadsFilter

If the new message has been added to a thread that has already been tagged “killed” then add the “killed” tag to this message. This allows for ignoring all replies to a particular thread.

ListMailsFilter

This filter looks for the List-Id header, and if it finds it, adds the list name as a tag, together with the tag “lists”.

ArchiveSentMailsFilter

Basically does what it says on the tin. Though more accurately, it looks for emails that are from one of your addresses and not to any of your addresses. It then adds the sent tag and removes the inbox tag.

InboxFilter

This removes the new tags, and adds the “inbox” tag to any message that isn’t killed or spam.

FolderNameFilter

This looks at which folder each email is in and uses that name as a tag for the email. So if you have a procmail or sieve set up that puts emails in folders for you, this might be useful.

Adding your own filters to afew

You can modify filters, and define your own versions of the base Filter that allow you to tag messages in a similar way to the notmuch tag command, using the settings above. Showing some sample configs is the easiest way to understand. The notmuch initial tagging page shows a sample config:

# immediately archive all messages from "me"
notmuch tag -new -- tag:new and from:me@example.com

# delete all messages from a spammer:
notmuch tag +deleted -- tag:new and from:spam@spam.com

# tag all message from notmuch mailing list
notmuch tag +notmuch -- tag:new and to:notmuch@notmuchmail.org

# finally, retag all "new" messages "inbox" and "unread"
notmuch tag +inbox +unread -new -- tag:new

The equivalent in afew would be:

[ArchiveSentMailsFilter]

[Filter.spamcom]
message = Delete all messages from spammer
query = from:spam@spam.com
tags = deleted;-new

[Filter.notmuch]
message = Tag all messages from the notmuch mailing list
query = to:notmuch@notmuchmail.org
tags = notmuch

[Filter.myinbox]
message = My version of the inbox filter
query = tag:new
tags = inbox;unread;-new

Note that the queries do not include tag:new because this is implied when afew is run with the --new flag.

Here are a few more example filters from github dotfiles:

[Filter.1]
query = 'sicsa-students@sicsa.ac.uk'
tags = +sicsa
message = sicsa

[Filter.2]
query = 'from:foosoc.ed@gmail.com OR from:GT Silber OR from:lizzie.brough@eusa.ed.ac.uk'
tags = +soc;+foo
message = foosoc

[Filter.3]
query = 'folder:gmail/G+'
tags = +G+
message = gmail spam

# skip inbox
[Filter.6]
query = 'to:notmuch@notmuchmail.org AND (subject:emacs OR subject:elisp OR "(defun" OR "(setq" OR PATCH)'
tags = -inbox;-new
message = notmuch emacs stuff

If you need more powerful processing you can write filters that match regular expressions against any header in the email with the HeaderMatchingFilter, but more on that in a future blog post.

Python 2.7 on Debian Squeeze

This post is part of my notmuch of a journey series.

There were various python libraries and apps I wanted to use, and afew and alot in particular require python 2.7.

I am running all this on a machine running Debian Squeeze, which ships with python 2.6. A little research reveals that it would be unwise to grab python 2.7 from testing/Wheezy, as it would then become the system python and break lots of things.

But then I came across pythonbrew which is inspired by the ruby rvm – it allows you to install different versions of python inside your home directory. However, doing this straight off might lead you to be missing some vital functionality related to compression or unicode (as I did the first couple of attempts). So the debian packages to install (covering all the packages I need for my notmuch journey) are:

sudo apt-get install build-essential libbz2-dev libc6-dev libexpat1-dev \
    libgcrypt11-dev libglib2.0-dev libgmime-2.4-dev libgpg-error-dev \
    libgpgme11-dev libncurses5-dev libncursesw5-dev libreadline-dev \
    libsqlite3-dev libssl-dev zlib1g-dev python-pip
sudo apt-get install -t testing libnotmuch-dev notmuch libnotmuch3

Then you can do:

pip install pythonbrew
pythonbrew install 2.7.3
pythonbrew use 2.7.3   # this starts using python 2.7 in the current shell
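Since a build that is missing the -dev packages can still appear to succeed, it may be worth checking that the optional modules actually import. The following is a sketch only: check_modules and the PYTHON variable are names invented here, and the choice of which modules to check is an assumption based on the package list above.

```shell
#!/bin/sh
# PYTHON is a made-up override; it defaults to whichever python is first on
# the PATH, which after "pythonbrew use 2.7.3" should be the new build.
PYTHON="${PYTHON:-python}"

# check_modules MODULE...
# Try to import each named module and report the ones that fail.
# Returns non-zero if any module is missing.
check_modules() {
    missing=0
    for module in "$@"; do
        if "$PYTHON" -c "import $module" >/dev/null 2>&1; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_modules zlib bz2 ssl sqlite3 readline curses after the install should list each module as ok. A MISSING line suggests the matching -dev package was not present when python was compiled, in which case installing the package and re-running the pythonbrew install is the likely fix.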

I also wanted to have the notmuch python bindings, so I did:

git clone git://notmuchmail.org/git/notmuch
cd notmuch/bindings/python
python setup.py install

At this point I was pretty much good to go, although I should also mention that you can save having to remember which version of python you are currently using by having wrapper scripts with contents like:

#!/bin/sh
"$HOME/.pythonbrew/pythons/Python-2.7.3/bin/alot" "$@"

Finally, I also came across a fork of pythonbrew called pythonz that I may look into further at some point. It supports PyPy and Jython amongst others, but as it says in the README “[pythonbrew] has some extra features which I don’t really need, so I made this for to make something a bit simpler that works for me” so some things might be broken …

Offlineimap and Msmtp

This post is part of my notmuch of a journey series.

notmuch doesn’t get my email or send it, so I’m turning to two commonly used tools to help me with this.

  • offlineimap fetches the email over IMAP (and syncs it back)
  • msmtp sends the email for me

The setup I had is fairly standard, though a key issue was password storage.

Password Storage

I’m running on a server and connecting over ssh, as I want the same set of tags to be available everywhere. I don’t want my password sitting on disk in clear text (particularly as it is my LDAP password, so used for sudo, ssh and various other logins). There are various solutions to this if you are running a desktop – python-keyring-lib will link to Gnome, KDE and OS X keyrings. But it took a bit more searching to work out how to do this on a server.

I considered a number of options

GPG encrypted file

I found this Unix Stack Exchange post about having a python script that could decrypt a GPG encrypted file (one file per password). gpg-agent would run on the server so you don’t have to enter your passphrase every time you need to run it. I got this running, generating a GPG key that I wasn’t actually going to use for email. That meant I was OK setting a long timeout for gpg-agent – I set it to 24 hours by putting the following in ~/.gnupg/gpg-agent.conf:

pinentry-program /usr/bin/pinentry-curses
default-cache-ttl 86400
max-cache-ttl 86400

However I wanted to be able to leave offlineimap running and at some point gpg-agent would expire, so this wasn’t really what I wanted.

That said I did leave this system in place for msmtp, as that will only run when I send an email – so I will be logged in at that point and can enter my GPG passphrase into gpg-agent. The relevant line for the .msmtprc file is:

passwordeval gpg --use-agent --quiet --batch -d ~/.passwd/account.gpg
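The encrypted file that passwordeval reads has to be created first. A hedged sketch of one way to do that follows; encrypt_password is a name made up here, and the key ID and output path are whatever you chose when generating the key.

```shell
#!/bin/sh
# encrypt_password KEYID OUTFILE
# Reads the password from stdin and writes it, encrypted to KEYID, to OUTFILE.
# The function name and its arguments are made up for this sketch.
encrypt_password() {
    keyid=$1
    outfile=$2
    # The subshell keeps the tighter umask from leaking into the caller;
    # 077 makes the new file readable only by its owner.
    ( umask 077 && gpg --encrypt --recipient "$keyid" --output "$outfile" )
}
```

For example, encrypt_password followed by the key ID and ~/.passwd/account.gpg will wait for the password on stdin; type it, press Enter, then Ctrl-D. Bear in mind that the password is echoed to the terminal as you type, so this is best done when nobody is looking over your shoulder.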

Environment variables

I found a blog post describing caching the decrypted password in environment variables which seemed quite cunning. That would only survive for the session, but then I leave tmux running all the time, so as long as tmux survived, the environment variables would. But then I thought, why not just:

Leave offlineimap running in tmux

When you run offlineimap, if there is no password or passwordeval in the .offlineimaprc file, then offlineimap will ask for your password (provided the ui isn’t set to be Noninteractive). offlineimap can also be told to run repeatedly, without exiting, by putting autorefresh = 5 to re-run every 5 minutes. In between runs, offlineimap will just sit at the terminal waiting, but it remembers your password, keeping it in memory. Using this I could leave offlineimap running for months inside tmux on the server I use.

(Note that the autorefresh setting must be in the [Account X] section – I originally put it in the Repository section, which didn’t work.)

ssh keys and offlineimap preauthtunnel

Finally I worked out that I could talk directly to an imap command line utility on the remote server over ssh – using a passwordless ssh key to avoid any passwords. offlineimap is instructed to use it by the preauthtunnel option.

There are two main IMAP servers in the linux world, courier and dovecot. The relevant commands are:

# Courier IMAP (Debian - you should check the path on CentOS)
preauthtunnel = ssh -o Compression=yes -q IMAPHOST '/usr/bin/imapd ./Maildir'

# Dovecot IMAP (CentOS)
preauthtunnel = ssh -o Compression=yes -q IMAPHOST 'MAIL=maildir:~/Maildir exec /usr/libexec/dovecot/imap'
# Dovecot IMAP (Debian)
preauthtunnel = ssh -o Compression=yes -q IMAPHOST 'MAIL=maildir:~/Maildir exec /usr/lib/dovecot/imap'

Replace IMAPHOST and ~/Maildir with your own values.

There are two ways to do this so that you don’t have to enter the ssh key passphrase all the time. You could set up a password-less ssh key to do this, or you could leave an ssh session running so that offlineimap can multiplex its connections on to it.

A password-less ssh key leads to the possibility of abuse, but you can generate a new ssh key just for imap and specify its use by modifying the above command to look like:

preauthtunnel = ssh -q -i /home/hamish/.ssh/id_mail_imap -o Compression=yes IMAPHOST '/usr/libexec/dovecot/imap'

On the mail server side you can then lock down this key by configuring what command can be run in the .ssh/authorized_keys file. Mine looks like:

command="/usr/libexec/dovecot/imap",no-X11-forwarding,no-agent-forwarding,no-port-forwarding,no-pty ssh-dss AAAAB3Nza...

Obviously this will only work for you if you have ssh access to your mail server, so it won’t be an option for all. But I’ve given some other options above, so I hope you can find something that works for you.

Other notes

sendmail over ssh

Another option for sending email to a remote server is to use the sendmail command over ssh (assuming you have ssh access to the server in question). First you need to generate a passwordless ssh key and copy it to the mail server:

ssh-keygen -t dsa -f ~/.ssh/id_mail_smtp
scp ~/.ssh/id_mail_smtp.pub smtpserver.example.org:.ssh/

Then, on the mail server, put the key in the authorized_keys file:

cat ~/.ssh/id_mail_smtp.pub >> ~/.ssh/authorized_keys

Next we edit the authorized keys file, adding the command to run to the beginning of the last line (which is currently our smtp key):

command="/usr/sbin/sendmail -bm -t -oem -oi",no-X11-forwarding,no-agent-forwarding,no-port-forwarding,no-pty ssh-dss AAA...

In this case the command is the sendmail command for exim – I’m afraid you’ll have to work out what your own sendmail command is. Finally, in your mail client, tell it to use ssh as the sendmail command. The following works for both mutt and alot:

ssh -q -i /home/hamish/.ssh/id_mail_smtp smtpserver.example.org

Now whenever your mail client wants to send email, it will start up the ssh connection and print the email to it. This will be picked up by the sendmail command on the far end and off the email will go.

ssh ControlMaster

ControlMaster is an ssh option that means you can leave an ssh connection open with a master session, so that future ssh connections reuse the master session rather than having to start an entirely new connection. This should mean quicker send and receive. One way to specify it in your ~/.ssh/config file is:

ControlMaster auto
ControlPersist 4h
ControlPath /home/username/.ssh/muxcontrol/%r@%h:%p

This means that connections will be automatically started, will stick around for up to 4 hours, and the socket for your connection to mail.example.com on port 22 with username user would be at /home/username/.ssh/muxcontrol/user@mail.example.com:22 (don’t forget to create the directory). (And the default directory is the global /tmp directory.)
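Creating the socket directory is a one-off step. A small sketch, assuming the ControlPath shown above with /home/username written as $HOME; make_mux_dir is a made-up name.

```shell
#!/bin/sh
# make_mux_dir [DIR]
# Create the directory that holds the ControlMaster sockets and make it
# private to the user. The function name is made up for this sketch; the
# default path matches the ControlPath setting above.
make_mux_dir() {
    dir=${1:-$HOME/.ssh/muxcontrol}
    mkdir -p "$dir" && chmod 700 "$dir"
}
```

Calling make_mux_dir with no arguments creates ~/.ssh/muxcontrol. The chmod matters because anyone who can write to a control socket can reuse the connection behind it.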

However, a gotcha here is that the connection to the host will use whatever ssh key was used for the initial connection, not the one specified at the command line. So if offlineimap runs first, using your ssh key tied to the imap command, then when you try to send an email using sendmail over ssh, your ssh command will run imap rather than sendmail – and your email will go nowhere.

The solution is to specify a different ControlPath for these commands. So for offlineimap you could use:

preauthtunnel = ssh -o Compression=yes -o ControlPath=/home/username/.ssh/muxcontrol/IMAP_%r@%h:%p -q IMAPHOST 'MAIL=maildir:~/Maildir exec /usr/libexec/dovecot/imap'

and for sending:

ssh -q -o ControlPath=/home/username/.ssh/muxcontrol/SMTP_%r@%h:%p -i /home/hamish/.ssh/id_mail_smtp smtpserver.example.org

This will keep more ssh connections active, but the right action will be done using your different ssh keys.

Checking IMAP mailboxes

I was moving folders around and was having trouble working out whether the folders were available but offlineimap was failing to find them, or if dovecot wasn’t showing me the folders. Eventually I found how to ask for a folder list over telnet. In short:

telnet imap.example.org 143
A login username password
B list "" *
C logout

(include the A, B, C). Or to do this over ssl you could connect with:

openssl s_client -connect mail.aptivate.org:993

and then run the above commands.
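If you need to do this more than once, the commands can be generated rather than typed. This is only a sketch: imap_list_commands is a made-up name, and it assumes the username and password contain no spaces or quotes, which would need IMAP quoting.

```shell
#!/bin/sh
# imap_list_commands USER PASSWORD
# Print the three tagged IMAP commands from above, each ending in CRLF
# as the protocol expects. The function name is made up for this sketch.
imap_list_commands() {
    printf 'A login %s %s\r\n' "$1" "$2"
    printf 'B list "" *\r\n'
    printf 'C logout\r\n'
}
```

The output can then be piped into the ssl connection, for example imap_list_commands username password | openssl s_client -quiet -connect mail.aptivate.org:993. The -quiet option hides the certificate chatter and keeps the connection open after the input ends, so the folder list is printed before the server closes the session in response to the logout.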

Other ways of sending to multiple accounts

In my research, I also found that you can set up postfix to use different servers for different accounts (though I’m not sure how well it would handle encrypted passwords) and I may improve my msmtp set up by using msmtp queue at some point, and see how well it deals with encrypted passwords.

Notmuch of a Journey

I recently decided to switch to using notmuch as a mail indexer. I’ve been a long-time user of sup and love the command line interface, gmail-like threads and awesome search. However it can often be a bit cranky and slow, and it “doesn’t play well with others” – if the mbox files it uses are modified (say by a second imap client) then it needs to rebuild its index. I could live with all those and used it for my work email for about three years.

The real nail in the coffin is that it has mostly died as a project (though there have been a few new people stepping up recently). I have made quite a few commits to it over the years, and it gets the occasional bug fix, but the mailing list has dropped off to nothing over the years. There was some excitement when the original author of sup, William Morgan, announced heliotrope and turnsole as a client server version, but it never really gained traction.

So I decided to investigate notmuch – which is basically a C program and library which takes your email and gives it to Xapian (a full text search database – also used by sup under the hood) and then runs queries against it. But notmuch does … not much else, so I also need to fetch email, add the new emails to notmuch, run a mail client, send email and look up email addresses.

It’s taken a lot of research and playing with various tools. Some of those tools were great, but poorly documented, so part of the point of these blog posts is to give some info for other people to find. And I might even get round to adding to the documentation for some of these projects.

So here are the parts – I’ve put up a post about each major part, and all the small bits and links are part of this post.

Two copies of notmuch

I run two copies of notmuch – one for my gmail account and one for my work one. I want to keep them separate, so I have set up a parallel config. The default file is ~/.notmuch-config and I have added ~/.notmuch-config-work. You can tell notmuch to use it via an environment variable – $NOTMUCH_CONFIG – but I don’t want to set that all the time, so I have a file ~/bin/notmuchwork with the contents:

NOTMUCH_CONFIG=~/.notmuch-config-work notmuch "$@"

and then have an alias set up in my .bashrc:

alias nmw=$HOME/bin/notmuchwork

I have a similar setup for alot and afew – although for those you can define the config path on the command line rather than through an environment variable. This is necessary as they access notmuch through the python library rather than through the command line.

My dotfiles

My dotfiles are on github and have all of my config – have fun poking around.