

                  EXIM'S USER INTERFACE TO MAIL FILTERING


Exim is a mail transfer agent for Unix systems. This document describes the
user interface to its inbuilt mail filtering facility, and is copyright (c)
University of Cambridge 1997.
___________________________________________________________________________


1. Introduction

Most Unix mail transport agents (programs that deliver mail) permit
individual users to specify automatic forwarding of their mail, usually by
placing a list of forwarding addresses in a file called .forward in their
home directories. Exim extends this facility by allowing the forwarding
instructions to be a set of rules rather than just a list of addresses, in
effect providing '.forward with conditions'. Operating the set of rules is
called filtering, and the file that contains them is called a filter file.

The ability to use filtering has to be enabled by the system administrator,
and some of the individual facilities can be separately enabled or
disabled. A local document should be provided to describe exactly what has
been enabled. In the absence of this, consult your system administrator.

It is important to realize that no deliveries are actually made while a
filter file is being processed. The result of filtering is a list of
destinations to which a message should be delivered - the deliveries
themselves take place later, along with all other deliveries for the
message. This means that it is not possible to test for successful
deliveries while filtering. It also means that duplicate addresses
generated by filtering are dropped, as with any other duplicate addresses.

This document describes how to use a filter file and the format of its
contents. It is intended for use by end-users. How the system administrator
can set up and control the use of filtering is described in the full Exim
specification.


2. Testing a new filter file

Filter files, especially the more complicated ones, should always be
tested, as it is easy to make mistakes. Exim provides a facility for
preliminary testing of a filter file before installing it. This tests the
syntax of the file and its basic operation, and can also be used with
ordinary .forward files.

Because a filter can do tests on the content of messages, a test message is
required. Suppose you have a new filter file called "new-filter" and a test
message called "test-message". Assuming that Exim is installed with the
conventional path name /usr/lib/sendmail, the following command can be
used:

  /usr/lib/sendmail -bf new-filter <test-message

The -bf option tells Exim that the following file is a filter to be tested,
and the test message is supplied on the standard input. If there are no
message-dependent tests in the filter, then an empty file can be used. A
supplied message must start with header lines or the 'From' message
separator line which is found in many multi-message folder files. Note that
blank lines at the start terminate the header lines. A warning is given if
no headers are read.

The result of running this command, provided no errors are detected, is a
list of the actions that Exim would try to take if presented with the
message for real. For example, the output

  Deliver message to: gulliver@lilliput.fict.book
  Save message to: /home/lemuel/mail/archive

means that one copy of the message would be sent to
gulliver@lilliput.fict.book, and another would be added to the file
/home/lemuel/mail/archive.

The actions themselves are not attempted while testing a filter file in
this way; there is no check, for example, that any forwarding addresses are
valid. If you want to know why a particular action is being taken, add the
-v option to the command. This causes Exim to output the results of any
conditional tests and to indent its output according to the depth of
nesting of "if" commands. Further output from a filter test can be
generated by the testprint command, which is described below.

When testing a filter in this way, Exim makes up an 'envelope' for the
message. The recipient is by default the user running the command, and so
is the sender, unless the command is run with the -f option to supply a
different sender. For example,

  /usr/lib/sendmail -bf new-filter -f islington@neverwhere <test-message

Alternatively, if the first line of the supplied message is a 'From'
separator from a message folder file (not the same thing as a "From:"
header line), the sender is taken from there, overriding the -f option. The
'return path' is the same as the envelope sender, unless the message
contains a "Return-path:" header, in which case it is taken from there. You
need not worry about any of this unless you want to test out features of a
filter file that rely on the sender address or the return path.

It is possible to change the envelope by specifying further options. The
-bfd option changes the domain of the recipient address, while the -bfl
option changes the 'local part', that is, the part before the @ sign. An
adviser could make use of these to test someone else's filter file.

The -bfp and -bfs options specify the prefix or suffix for the local part.
These are relevant only when support for multiple personal mailboxes is
implemented; see the description in section 13 below.


3. Installing a filter file

A filter file is normally installed under the name .forward in your home
directory - it is distinguished from a conventional .forward file by its
first line (described below). However, the file name is configurable, and
some system administrators may choose to use some different name or
location for filter files.


4. Testing an installed filter file

Testing a filter file before installation cannot find every potential
problem; for example, it does not actually run commands to which messages
are piped. Some 'live' tests should therefore also be done once a filter is
installed.

If at all possible, test your filter file by sending messages from some
OTHER account. If you send a message to yourself from the filtered account,
and delivery fails, the error message will be sent back to the same
account, which may cause another delivery failure. It won't cause an
infinite sequence of such messages, because delivery failure messages do
not themselves generate further messages. However, it does mean that the
failure won't be returned to you, and also that the postmaster will have to
investigate the stuck message.

If you have to test a filter from the same account, then a sensible
precaution is to include the line

  if error_message then finish endif

as the first filter command, at least while testing. This causes filtering
to be abandoned for a delivery failure message, and since no destinations
are generated, the message goes on to get delivered to the original
address.


5. Format of filter files

Apart from leading white space, the first text in a filter file must be

  # Exim filter

This is what distinguishes it from a conventional .forward file. If the
file does not have this initial line it is treated as a conventional
.forward file, both when delivering mail and when using the -bf testing
mechanism. The white space in the line is optional, and any capitalization
may be used. Further text on the same line is treated as a comment. For
example, you could have

  #   Exim filter   <<== do not edit or remove this line!

The remainder of the file is a sequence of filtering commands, which
consist of keywords and data values, separated by white space and/or line
breaks, except in the case of conditions for the "if" command, where round
brackets (parentheses) also act as separators. For example, in the command

  deliver gulliver@lilliput.fict.book

the keyword is "deliver" and the data value is
"gulliver@lilliput.fict.book". The commands are in free format, and there
are no special terminators. If the character # follows a separator, then
everything from # up to the next newline is ignored. This provides a way of
including comments in a filter file.

There are two ways in which data values can be input:

 .   If the text contains no white space then it can be typed verbatim.
     However, if it is part of a condition, it must also be free of round
     brackets (parentheses), as these are used for grouping in conditions.

 .   Otherwise it must be enclosed in double quotation marks. In this case,
     the character \ (backslash) is treated as an 'escape character' within
     the string, causing the following character or characters to be
     treated specially:

       \n      is replaced by a newline
       \r      is replaced by a carriage return
       \t      is replaced by a tab

     Backslash followed by up to three octal digits is replaced by the
     character specified by those digits, and \x followed by up to two
     hexadecimal digits is treated similarly. Backslash followed by any
     other character is replaced by that character, so that in
     particular, \" becomes " and \\ becomes \.

In addition to the escape character processing that occurs when strings are
enclosed in quotes, most data values are also subject to string expansion
(as described in the next section), in which case the character $ is also
significant.
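
As a sketch of the escape rules above, the following hypothetical
testprint commands (testprint is described in section 11 below) use only
escapes listed in this section. The wording of the strings is invented
for the example:

  testprint "Name:\tLemuel\nQuote: \"hello\""
  testprint "octal \101 and hex \x41 both stand for the letter A"

In the first string, \t becomes a tab, \n a newline, and each \" a
literal double quote. In the second, \101 (octal) and \x41 (hexadecimal)
both denote character code 65, which is "A" in ASCII.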


6. String expansion

Most data values are expanded before use. Expansion consists of replacing
substrings beginning with $ with other text. The full expansion facilities
are described in section 17 below, but the most common case is the
substitution of a simple variable. For example, the substring

  $reply_address

is replaced by the address to which replies to the message should be sent.
If such a variable name is followed by a letter or digit or underscore, it
must be enclosed in curly brackets (braces), for example,

  ${reply_address}

The variables most likely to be useful in filter files are:

home: The user's home directory.

local_part: The part of the email address that precedes the @ sign -
normally the user's login name. If support for multiple personal mailboxes
is enabled (see section 13 below) and a prefix or suffix for the local part
was recognized, it is removed from the string in this variable.

local_part_prefix: If support for multiple personal mailboxes is enabled
(see section 13 below), and a local part prefix was recognized, then this
variable contains the prefix. Otherwise it contains an empty string.

local_part_suffix: If support for multiple personal mailboxes is enabled
(see section 13 below), and a local part suffix was recognized, then this
variable contains the suffix. Otherwise it contains an empty string.

message_body: The initial portion of the body of the message. By default,
up to 500 characters are read into this variable, but the system
administrator can configure this to some other value. Newlines in the body
are converted into single spaces.

message_id: The message's local identification string, which is unique for
each message handled by a single host.

message_size: The size of the message, in bytes.

original_local_part: When a top-level address is being processed, this
contains the same value as local_part. However, if an address generated by
an alias, forward, or filter file is being processed, this variable
contains the local part of the original address.

reply_address: The address from the "Reply-to:" header, if the message has
one; otherwise the address from the "From:" header. It is the address to
which normal replies to the message should be sent.

return_path: The return path - that is, the sender field that is sent as
part of the message's envelope, and which is the address to which delivery
errors are sent. In many cases, this has the same value as sender_address,
but if, for example, an incoming message to a mailing list has been
expanded, then return_path may contain the address of the list maintainer
instead.

sender_address: The sender address that was received with the message.

tod_full: A full version of the time and date, for example: Wed, 18 Oct
1995 09:51:40 +0100. The timezone is always given as a numerical offset
from GMT.

tod_log: The time and date in the format used for writing Exim's log files,
for example: 1995-10-12 15:32:29.

In addition to these 'ordinary' variables, there is a special set of
variables containing the headers of the message being processed. These
variables have names beginning with "$header_" followed by the name of the
header, terminated by a colon. The whole item, including the terminating
colon, is replaced by the contents of the message header. If there is more
than one header with the same name, their contents are concatenated, with a
single newline character between them. For example,

  $header_from:
  $header_subject:

The capitalization of the name following "$header_" is not significant.
Because any printing character except colon may appear in the name of a
message's header (this is a requirement of RFC 822, the document that
describes the format of a mail message), curly brackets must not be used
in this case, as they will be taken as part of the header name. Two
shortcuts are allowed in naming header variables:

 .   The initiating "$header_" can be abbreviated to "$h_".

 .   The terminating colon can be omitted if the next character is white
     space. The white space character is retained in the expanded string.
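
These naming rules can be illustrated by a short sketch using the
testprint command (described in section 11 below). The text of the
strings is invented for the example; only the variable names come from
this section:

  testprint "subject=$h_subject (from $header_from:)"
  testprint "id=${message_id}_copy"

In the first line, "$h_subject" is the abbreviated form with the colon
omitted because white space follows, whereas "$header_from:" needs its
colon because the next character is a bracket. In the second line the
braces are required because an underscore follows the variable name.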

If the message does not contain a header of the given name, an empty string
is substituted. Thus it is important to spell the names of headers
correctly. Do not use "$header_Reply_to" when you really mean
"$header_Reply-to".


7. Significant deliveries

When in the course of delivery a message is processed by a filter file,
what happens next depends on whether the filter has set up any significant
deliveries or not. If there is at least one significant delivery, then the
filter is considered to have handled the entire delivery for the current
address, and no further deliveries are done. If, however, no significant
deliveries have been set up, Exim continues processing the current address
as if there were no filter file, and typically delivers it into a local
mailbox. In particular, this happens in the special case of a filter file
containing only comments.

The delivery commands described in the next section are by default
significant. However, if such a command is preceded by the word "unseen",
then its delivery is not considered to be significant. In contrast, other
commands such as "mail" and "vacation" do not count as significant
deliveries unless preceded by the word "seen".
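
For example, a minimal sketch of a filter that forwards a copy of every
message while leaving normal delivery untouched might look like this (the
address is the illustrative one used elsewhere in this document):

  # Exim filter
  unseen deliver gulliver@lilliput.fict.book

Because the only delivery is marked "unseen", no significant delivery has
been set up, so Exim should go on to process the address as if there were
no filter file, typically delivering into the local mailbox as well.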


8. Delivery commands

There are three commands that cause a copy of the message to be
transmitted:

       deliver <mail address>
  e.g. deliver "Dr Livingstone <David@somewhere.africa>"

This is a mail forwarding operation. The message is sent on to the given
address, exactly as if the address had appeared in a traditional
.forward file. To deliver a copy of the message to your normal mailbox,
your login name can be given. Once a message has been processed by the
filtering mechanism, it will not be so processed again, so doing this does
not cause a loop. However, if you have a mail alias, you should not refer
to it here. For example, if the mail address "L.Gulliver" is aliased to
"lg103" then all references in Gulliver's .forward file should be to
"lg103". A reference to the alias will not work for messages that are
addressed to that alias, since, like .forward file processing, aliasing is
performed only once on an address, in order to avoid looping.

       save <file name>
  e.g. save $home/mail/bookfolder

This causes a copy of the message to be appended to the given file (that
is, mail folder). If the name does not start with a / character, then the
contents of the $home variable are prepended. The user must of course have
permission to write to the file. In addition, the ability to use this
command is controlled by the system administrator - it may be forbidden on
some systems. An optional mode value may be given after the file name, for
example,

       save /some/folder 0640

This makes it possible for users to override the system-wide mode setting
for file deliveries, which is normally 0600. If an existing file does not
have the correct mode, it is changed.

       pipe <command>
  e.g. pipe "$home/bin/countmail $sender_address"

This command causes a separate process to be run, and a copy of the message
is passed on its standard input. The command supplied to pipe is split up
by Exim into a command name and a number of arguments, delimited by white
space except for arguments enclosed in double quotes, in which case
backslash is interpreted as an escape, or in single quotes, in which case
no escaping is recognized. Note that as the whole command is normally
supplied in double quotes, a second level of quoting is required for
internal double quotes. For example:

       pipe "$home/myscript \"size is $message_size\""

String expansion is performed on the separate components after the line has
been split up. Therefore substitution cannot change the number of
arguments, nor can quotes and backslashes in variables cause confusion. The
command is run directly by Exim; it is not run under a shell.

The default PATH set up for the command is determined by the system
administrator, usually containing at least /usr/bin so that common commands
are available without having to specify an absolute file name. However, it
is possible for the system administrator to restrict the pipe facility so
that the command name must not contain any / characters, and must be found
in one of the directories in the configured PATH. It is also possible for
the system administrator to lock out the use of the pipe command
altogether.

When the command is run, the following environment variables are set up:

  DOMAIN               the local domain of the address
  HOME                 your home directory
  LOCAL_PART           your login name
  LOGNAME              your login name
  MESSAGE_ID           the message's unique id
  PATH                 the command search path
  SENDER               the sender of the message
  SHELL                /bin/sh
  USER                 your login name

If you run a command that is a shell script, be very careful in your use of
data from the incoming message in the commands in your script. RFC 822 is
very generous in the characters that are legally permitted to appear in
mail addresses, and in particular, an address may begin with a vertical bar
or a slash. For this reason you should always use quotes round any
arguments that involve data from the message, like this:

  /some/command "$SENDER"

so that inserted shell meta-characters do not cause unwanted effects.


9. Mail commands

There are two commands which cause the creation of a new mail message,
which does not count as a significant delivery unless the command is
preceded by the word "seen". This is a powerful facility, but it should be
used with care, because of the danger of creating infinite sequences of
messages. The system administrator can forbid the use of these commands
altogether.

To help prevent runaway message sequences, these commands have no effect
when the incoming message is a delivery error message, and messages sent by
this means are treated as if they were reporting delivery errors. Thus they
should never themselves cause a delivery error message to be returned. The
basic mail-sending command is

       mail [to <address-list>]
            [cc <address-list>]
            [bcc <address-list>]
            [subject <text>]
            [text <text>]
            [[expand] file <filename>]
            [log <log file name>]
            [once <note file name>]

  e.g. mail text "Your message re $h_subject received"

As a convenience for use in one common case, there is also a command called
vacation. It behaves in the same way as mail, except that the defaults for
the "file", "log", and "once" options are

  expand file .vacation.msg
         log  .vacation.log
         once .vacation

respectively. This mimics the behaviour of the traditional Unix vacation
command. If a file name is given to "vacation", it is expanded only if
explicitly requested.

The key/value argument pairs can appear in any order. At least one of
"text" or "file" must appear (except with "vacation"); if both are present,
the text string appears first in the message. If "expand" precedes "file",
then each line of the file is subject to string expansion as it is included
in the message.

If no "to" argument appears, the message is sent to the address in the
"reply_address" variable (see section 6 above). An "In-Reply-To:" header is
automatically included in the created message, giving a reference to the
message identification of the incoming message.

If a log file is specified, a line is added to it for each message sent. If
a "once" file is specified, it is used to create a database for remembering
who has received a message, and no more than one message is ever sent to
any particular address. The file name specified for "once" is used as the
base name for direct-access (DBM) file operations. On some operating
systems this results in two files being created, with the suffixes ".dir"
and ".pag" being added to the given name. On other systems a single file
with the suffix ".db" is used, while on some systems the name is used
unchanged.
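
As an illustration of how these options fit together, here is a sketch of
an automatic acknowledgement. The file names and the wording of the reply
are hypothetical, and the "personal" condition is described in section 12
below:

  # Exim filter
  if personal then
    mail subject "Re: $h_subject:"
         text "Thank you for your message. I will reply shortly."
         log $home/autoreply.log
         once $home/autoreply.once
  endif

No "to" argument is given, so the reply goes to $reply_address. Each reply
adds a line to the log file, and the "once" database should ensure that
any one correspondent receives the acknowledgement no more than once.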


10. Logging commands

A log can be kept of actions taken by a filter file. This facility is
normally available in conventional configurations, but there are some
situations where it might not be. Also, the system administrator may choose
to disable it. Check your local information if in doubt.

Logging takes place while the filter file is being interpreted. It does not
queue up for later like the delivery commands. The reason for this is so
that a log file need be opened only once for several write operations.
There are two commands, neither of which constitutes a significant
delivery.

       logfile <file name>
  e.g. logfile $home/filter.log

This defines a file to which logging output is subsequently written. The
file name may optionally be followed by a mode for the file, which is used
if the file has to be created. For example,

       logfile $home/filter.log 0644

The default for the mode is 0600. It is suggested that the logfile command
normally appear as the first command in a filter file. Once logfile has
been obeyed, the log command can be used to write to the log file. It is
possible to have more than one logfile command, to specify writing to
different log files in different circumstances.

       log "<some text string>"
  e.g. log "$tod_log $message_id processed"

Writing takes place at the end of the file, and a newline character is
added to the end of each string if there isn't one already there. Newlines
can be put in the middle of the string by using the \n escape sequence.
Lines from simultaneous deliveries may get interleaved in the file, as
there is no interlocking, so you should plan your logging with this in
mind. However, data should not get lost.
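
Putting the two commands together, a filter might keep a simple record of
what it sees. This is only a sketch; the file name and the text of the log
lines are assumptions made for the example, and the "if" command is
described in section 12 below:

  # Exim filter
  logfile $home/filter.log
  log "$tod_log $message_id from $sender_address"
  if $h_subject: contains "urgent" then
    log "  urgent subject: $h_subject:"
  endif

Since neither command is a significant delivery, a filter like this one
leaves the message to be delivered in the normal way.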


11. Other commands

The command "finish", which has no arguments, causes Exim to stop
interpreting the filter file. This is not a significant action unless
preceded by "seen". A filter file containing only "seen finish" is a black
hole.

It is sometimes helpful to be able to print out the values of variables
when testing filter files. The command

       testprint <text>
  e.g. testprint "home=$home reply_address=$reply_address"

does nothing when mail is being delivered. However, when the filtering code
is being tested by means of the -bf option, the value of the string is
written to the standard output.


12. Obeying commands conditionally

Most of the power of filtering comes from the ability to test conditions
and obey different commands depending on the outcome. The "if" command is
used to specify conditional execution, and its general form is

  if    <condition>
  then  <commands>
  elif  <condition>
  then  <commands>
  else  <commands>
  endif

There may be any number of "elif"-"then" sections (including none) and the
"else" section is also optional. Any number of commands, including nested
"if" commands, may appear in any of the <commands> sections.

Conditions can be combined by using the words "and" and "or", and round
brackets (parentheses) can be used to specify how several conditions are to
combine. Without brackets, "and" is more binding than "or". A condition can
be preceded by "not" to negate it, and there are also some negative forms
of condition that are more English-like.

There are three conditions that operate on text strings, using the words
'is', 'contains' and 'matches':

       <text1> is <text2>
       <text1> is not <text2>
  e.g. $local_part_suffix is "-foo"

An 'is' test does an exact match between the strings, without regard to the
case of letters, having first expanded both strings.

       <text1> contains <text2>
       <text1> does not contain <text2>
  e.g. $header_subject: contains "evolution"

A 'contains' test does a partial string match without regard to the case of
letters, having expanded both strings.

       <text1> matches <text2>
       <text1> does not match <text2>
  e.g. $sender_address matches "Bill|John"

For a 'matches' test, after expansion of both strings, the second one is
interpreted as a regular expression, but the matching is done independent
of case. The syntax of regular expressions supported by Exim is described
in section 16 below. Note that if you need a backslash in the expression
and it is quoted, you must use \\ because the string is subject to normal
Exim escape processing. Note also that since the regular expression string
is expanded, if you need a $ in the expression, it must be escaped, and
because of the previous comment, you need to use \\$, not just \$ if the
string is in quotes.

If the regular expression contains bracketed subexpressions, then numeric
variables such as $1 can be used in the subsequent actions after a
successful match. If the match fails, the values of the numeric variables
remain unchanged. Previous values are not restored after "endif" - in other
words, only one set of values is ever available. If the condition contains
several sub-conditions connected by "and" or "or", it is the
subexpressions from the last successful match that are available in
subsequent actions. Numeric variables from any one sub-condition are also
available for use in subsequent sub-conditions, since string expansion of
a condition occurs just before it is tested.

The following conditions are available for performing numerical tests:

       <number1> above <number2>
       <number1> is not above <number2>
       <number1> below <number2>
       <number1> is not below <number2>
  e.g. $message_size is not above 10k

The <number> arguments must expand to strings of digits, optionally
followed by one of the letters K or M (in either case) which cause
multiplication by 1024 and 1024x1024 respectively.
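
The string and numerical tests described so far can be combined with
"and", "or", and brackets, as in the following sketch. The folder names
and size limits are arbitrary choices for illustration:

  # Exim filter
  if $message_size is above 1M
  then
    save $home/mail/large
  elif ($h_subject: contains "empire" or
        $h_subject: contains "foundation") and
       $message_size is not above 10k
  then
    save $home/mail/short
  else
    testprint "no rule matched"
  endif

The brackets make the two "contains" tests group together before the size
test is applied. Without them, "and" would bind more tightly than "or", so
the size test would apply only to the "foundation" test.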

A common requirement is to distinguish between incoming personal mail and
mail from a mailing list. The condition

       personal

is a shorthand for

       $header_to: contains $local_part@$domain and
       $header_from: does not contain $local_part@$domain and
       $header_from: does not contain server@ and
       $header_from: does not contain daemon@ and
       $header_from: does not contain root@ and
       $header_subject: does not contain "circular" and
       $header_precedence: does not contain "bulk"

The variable "local_part" contains the local part of the mail address of
the user whose filter file is being run - it is normally your login id. The
"domain" variable contains the mail domain. This condition tests for the
appearance of the current user in the "To:" header, checks that the sender
is not the current user or one of a number of common daemons, and checks
the content of the "Subject:" and "Precedence:" headers.

If the system is configured to rewrite local parts of mail addresses, for
example, to rewrite 'dag46' as 'Dirk.Gently', then the rewritten form of
the address is also used in the tests.

It is quite common for people who have mail accounts on a number of
different systems to forward all their mail to one system, and in this case
a check for personal mail should test all their various mail addresses. To
allow for this, the personal condition keyword can be followed by

  alias <address>

any number of times, for example

  personal alias smith@else.where alias jones@other.place
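
In a complete command this might be used as in the following sketch. The
addresses are the illustrative ones above, and "vacation" is the command
described in section 9:

  # Exim filter
  if personal alias smith@else.where alias jones@other.place
  then
    vacation
  endif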

Whether or not any previously obeyed filter commands have resulted in
significant actions can be tested by the condition "delivered", for
example:

  if not delivered then save mail/anomalous endif

Finally, the condition "error_message" is true if the incoming message is a
mail delivery error message. Putting the command

  if error_message then finish endif

at the head of your filter file is a useful insurance against things going
wrong in such a way that you cannot receive delivery error reports.


13. Multiple personal mailboxes

The system administrator can configure Exim so that users can set up
variants on their email addresses and handle them separately. Consult your
system administrator or local documentation to see if this facility is
enabled on your system, and if so, what the details are.

The facility involves the use of a prefix or a suffix on an email address.
For example, all mail addressed to lg103-<something> would be the property
of user lg103, who could determine how it was to be handled, depending on
the value of <something>.

There are two possible ways in which this can be set up. The first
possibility is the use of multiple .forward files. In this case, mail to
lg103-foo, for example, is handled by looking for a file called
.forward-foo in lg103's home directory. If such a file does not exist,
delivery fails and the message is returned to its sender.

The alternative approach is to pass all messages through a single .forward
file, which must be a filter file in order to distinguish between the
different cases by referencing the variables local_part_prefix or
local_part_suffix, as in the final example in section 15 below. If the
filter file does not handle a prefixed or suffixed message, delivery fails
and the message is returned to its sender.

It is possible to configure Exim to support both schemes at once. In this
case, a specific .forward-foo file is first sought; if it is not found, the
basic .forward file is used.


14. Ignoring delivery errors

As was explained above, filtering just sets up addresses for delivery - no
deliveries are actually done while a filter file is active. If any of the
generated addresses subsequently suffers a delivery failure, an error
message is generated in the normal way. However, if the filter command
which sets up a delivery is preceded by the word "noerror", then errors for
that delivery, and any deliveries consequent on it (that is, from alias,
forwarding, or filter files it invokes) are ignored.
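
A sketch of its use follows. The address and the subject test are
hypothetical:

  # Exim filter
  if $h_subject: contains "newsletter" then
    noerror deliver archive@lilliput.fict.book
  endif

If the delivery to the archive address later fails, no error message
should be generated for it. Note that the delivery is still significant,
so messages that match are not also delivered to the normal mailbox.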


15. Examples of filter commands

Simple forwarding:

  # Exim filter
  deliver baggins@rivendell.middle.earth

Vacation handling using traditional means:

  # Exim filter
  unseen pipe "/usr/ucb/vacation \"$local_part\""

Vacation handling inside Exim:

  # Exim filter
  if personal then vacation endif

File some messages by subject:

  # Exim filter
  if $header_subject: contains "empire" or
     $header_subject: contains "foundation"
  then
     save $home/mail/f&e
  endif

Save all non-urgent messages by weekday:

  # Exim filter
  if $header_subject: does not contain "urgent" and
     $tod_full matches "^(...),"
  then
    save $home/mail/$1
  endif

Throw away all mail from one site, except from postmaster:

  # Exim filter
  if $reply_address contains "@spam.site" and
     $reply_address does not contain "postmaster@"
  then
     seen finish
  endif

Handle multiple personal mailboxes:

  # Exim filter
  if $local_part_suffix is "-foo"
  then
    save $home/mail/foo
  elif $local_part_suffix is "-bar"
  then
    save $home/mail/bar
  endif


16. Regular expressions

Exim uses Henry Spencer's freely distributable regular expression library.
The syntax of the regular expressions that it supports is as follows:

A regular expression is zero or more branches, separated by '|'. It matches
anything that matches one of the branches. A branch is zero or more pieces,
concatenated. It matches a match for the first, followed by a match for the
second, etc. A piece is an atom possibly followed by '*', '+', or '?'.

An atom followed by '*' matches a sequence of 0 or more matches of the
atom. An atom followed by '+' matches a sequence of 1 or more matches of
the atom. An atom followed by '?' matches a match of the atom, or the null
string.

An atom is a regular expression in parentheses (matching a match for the
regular expression), a range (see below), '.' (matching any single
character), '^' (matching the null string at the beginning of the input
string), '$' (matching the null string at the end of the input string), a
'\' followed by a single character other than '<' or '>' (matching that
character), the sequence '\<' (matching the null string preceding a letter,
digit, or underscore that is preceded by the start of the string or a
non-letter, non-digit, non-underscore), the sequence '\>' (matching the
null string preceding a non-letter, non-digit, or non-underscore), or a
single character with no other significance (matching that character).

A range is a sequence of characters enclosed in '[]'. It normally matches
any single character from the sequence. If the sequence begins with '^', it
matches any single character not from the rest of the sequence. If two
characters in the sequence are separated by '-', this is shorthand for the
full list of ASCII characters between them (e.g. '[0-9]' matches any
decimal digit). To include a literal ']' in the sequence, make it the first
character (following a possible '^'). To include a literal '-', make it the
first or last character.
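
As an illustration, and assuming that local part suffixes of the form
"-name" have been enabled, a range can be combined with '*' and parentheses
in a filter condition. The mailbox layout is hypothetical, and the numeric
variable $1 is used in the same way as in the weekday example in section
15:

  # Exim filter
  if $local_part_suffix matches "^-([a-z][a-z0-9]*)"
  then
    save $home/mail/$1
  endif

Here '[a-z]' matches one letter in the range a to z, '[a-z0-9]*' matches
any number of following letters or digits, and the parenthesized part is
made available as $1.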


17. More about string expansion

The description which follows in the next section is an excerpt from the
full specification of Exim, except that it lists only those expansion
variables that are likely to be useful in filter files.

Expanded strings are copied verbatim except when a dollar character is
encountered. This specifies the start of a portion of the string which is
interpreted and replaced as described below.

An uninterpreted dollar can be included in the string by putting a
backslash in front of it - if the string appears in quotes, two backslashes
are required because the quotes themselves cause some interpretation when
the string is read in. A backslash can in fact be used to prevent any
character being treated specially in an expansion.
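
For instance, to test for a literal dollar sign in a subject line, the two
condition fragments below should be equivalent according to the rule above.
The first is unquoted and needs one backslash; the second is quoted and
needs two. The test itself is only an illustration:

  $header_subject: contains \$100
  $header_subject: contains "\\$100"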


18. Expansion items

The following items are recognized in expanded strings. White space may be
used between sub-items that are keywords or sub-strings enclosed in braces
inside an outer set of braces, to improve readability.

$<variable name> or ${<variable name>}

   Substitute the contents of the named variable; the latter form can be
   used to separate the name from subsequent alphameric characters. The
   names of the variables are given in section 21 below. If the name of a
   non-existent variable is given, the expansion fails.

$header_<header name>: or $h_<header name>:

   Substitute the contents of the named message header, for example

     $header_reply-to:

   This particular expansion is intended mainly for use in users' filter
   files. The header names follow the syntax of RFC 822, which states that
   they may contain any printing characters except space and colon.
   Consequently, curly brackets do not terminate header names. Upper and
   lower case letters are synonymous in header names. If the following
   character is white space, the terminating colon may be omitted. If the
   message does not contain the given header, the expansion item is
   replaced by an empty string. If there is more than one header with the
   same name, they are all concatenated to form the substitution string,
   with a newline character between each of them.

${<op>:<string>}

   The string is first itself expanded, and then the operation specified by
   <op> is applied to it. A list of operators is given in section 19 below.
   The string starts with the first character after the colon, which may be
   leading white space.

${if <condition> {<string1>}{<string2>}}

   If <condition> is true, <string1> is expanded and replaces the whole
   item; otherwise <string2> is used. The second string need not be
   present; if it is not and the condition is not true, the item is
   replaced with nothing. Alternatively, the word 'fail' may be present
   instead of the second string (without any curly brackets). In this case,
   the expansion fails if the condition is not true. The available
   conditions are described in section 20 below.
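
   As a sketch, the following filter command chooses a mailbox name
   according to the local part suffix; the suffix and the mailbox names are
   hypothetical:

     save "$home/mail/${if eq {$local_part_suffix}{-foo} {foo}{inbox}}"

   The argument is quoted because it contains white space.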

${lookup{<key>} <search type> {<file>} {<string1>} {<string2>}}

${lookup <search type> {<query>} {<string1>} {<string2>}}

   These items specify data lookups in files and databases, as discussed in
   chapter 6 of the main Exim specification. The first form is used for
   single-key lookups, where 'partial-' is permitted to precede the search
   type in order to do partial matching, while the second is used for
   query-style lookups.

   If the lookup succeeds, then <string1> is expanded and replaces the
   entire item. During its expansion, a variable called value is available,
   containing the data returned by the file lookup. If the lookup fails,
   <string2> is expanded and replaces the entire item. It may be omitted,
   in which case the replacement is null.

   Instead of {<string2>} the word 'fail' can appear, and in this case, if
   the lookup fails, the entire string expansion fails in a way that can be
   detected by the caller. The consequences of this depend on the
   circumstances.

   The <key>, <file>, and <query> strings are expanded before use. For
   single-key lookups the search type must be one of

     dbm       do a DBM lookup
     lsearch   do a linear search
     nis       search a NIS map
     nis0      ditto, with trailing zero on the key

   For a linear search, a line beginning with the key followed by a colon
   is searched for, and the data is the remainder of the line and any
   continuations, in the format of an alias file. For a NIS search, <file>
   is the name of the NIS map. The 'nis0' form is required for Sun alias
   files. This example looks up the postmaster alias in the conventional
   alias file:

     ${lookup {postmaster} lsearch {/etc/aliases} {$value}}

   For query-style lookups the only available search type is 'nisplus', and
   the query must be a NIS+ indexed name, optionally followed by a colon
   and the name of a field to return. For example,

     "${lookup nisplus {[name=$local_part],passwd.org_dir:gcos} \
       {$value}fail}"

   looks up the full name of the user corresponding to the local part of an
   address, failing the expansion if it is not found.

${lookup{<key:subkey>} <search type> {<file>} {<string1>} {<string2>}}

   This searches for <key> in the file as described above for single-key
   lookups; if it succeeds, it extracts from the data a subfield which is
   identified by the <subkey>. The data related to the main key must be of
   the form:

     <subkey1> = <value1>  <subkey2> = <value2> ...

   where the equals signs are optional. If any of the values contain white
   space, they must be enclosed in double quotes, and any values that are
   enclosed in double quotes are subject to escape processing as described
   in section 5. For example, if a line in a linearly searched file
   contains

     alice: uid=1984 gid=2001

   then expanding the string

     ${lookup{alice:uid}lsearch{<file name>}{$value}}

   yields the string '1984'. If the subkey is not found in the data, then
   <string2> is expanded and replaces the entire item.

${extract{<key>}{<string>}}

   The key and the string are first expanded. Then the subfield identified
   by the key is extracted from the string, exactly as just described for
   lookup items with subkeys. If the key is not found in the string, the
   item is replaced by nothing.
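
   For example, with a literal string in the format shown above, the item

     ${extract{uid}{uid=1984 gid=2001}}

   should yield the string '1984', while asking for a key that is absent
   from the string, such as 'home', yields an empty string.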


19. Expansion operators

A string can be forced into lower case by the lc operator, for example

  ${lc:$local_part}

The length operator can be used to extract the initial portion of a string.
It is followed by an underscore and the number of characters required. For
example

  ${length_50:$message_body}

The result of this operator is either the first n characters or the whole
string, whichever is the shorter. The abbreviation l can be used instead of
length.

The substr operator can be used to extract more general substrings. It is
followed by an underscore and the starting offset, then a second underscore
and the length required. For example

  ${substr_3_2:$local_part}

If the starting offset is greater than the string length the result is the
null string; if the length plus starting offset is greater than the string
length, the result is the right-hand part of the string, starting from the
given offset. The first character in the string has offset 0. The
abbreviation s can be used instead of substr.
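
As a worked illustration, suppose that $local_part contains 'baggins'
(seven characters, with offsets 0 to 6). Following the rules given above:

  ${substr_3_2:$local_part}     yields 'gi'
  ${substr_5_10:$local_part}    yields 'ns'
  ${substr_9_2:$local_part}     yields the null string
  ${length_3:$local_part}       yields 'bag'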

The expand operator causes a string to be expanded for a second time. For
example,

  ${expand:${lookup{$domain}dbm{/some/file}{$value}}}

first looks up a string in a file while expanding the operand for expand,
and then re-expands what it has found.


20. Expansion conditions

The following conditions are available for testing while expanding strings:

  !<condition>

This negates the result of the condition.

  def:<variable>

This condition is true if the named expansion variable does not contain the
empty string. If the variable does not exist, the expansion fails.

  exists{<file name>}

The substring is first expanded and then interpreted as an absolute path.
The condition is true if the named file (or directory) exists. The
existence test is done by calling the stat() function.

  eq {string1}{string2}

The two substrings are first expanded. The condition is true if the two
resulting strings are identical, including the case of letters.

  match {string1}{string2}

The two substrings are first expanded. The second is then treated as a
regular expression and applied to the first. The condition is true if the
match succeeds. At the start of an "if" expansion the values of the numeric
expansion variables $1 etc. are remembered. Obeying a "match" condition
that succeeds causes them to be reset to the substrings of that condition
and they will have these values during the expansion of the success string.
At the end of the "if" expansion, the previous values are restored. After
testing a combination of conditions using "or", the subsequent values of
the numeric variables are those of the condition that succeeded.

  or {{cond1}{cond2}...}

The sub-conditions are evaluated from left to right. The condition is true
if any one of the sub-conditions is true. When a true sub-condition is
found, the following ones are parsed but not evaluated. Thus if there are
several 'match' sub-conditions the values of the numeric variables are
taken from the first one that succeeds.

  and {{cond1}{cond2}...}

The sub-conditions are evaluated from left to right. The condition is true
if all of the sub-conditions are true. When a false sub-condition is found,
the following ones are parsed but not evaluated.
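
As a sketch of how these conditions nest, the first item below yields 'yes'
only if the local part suffix is '-foo' and the directory $home/mail
exists; the second extracts the weekday from $tod_full by means of a
'match' condition. The suffix, the directory, and the result strings are
hypothetical:

  ${if and {{eq {$local_part_suffix}{-foo}}{exists{$home/mail}}}{yes}{no}}

  ${if match {$tod_full}{^(...),} {$1}{unknown}}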


21. Expansion variables

This list of expansion variables contains those that are likely to be of
use in filter files. Others that are not relevant at filtering time, or are
of interest only to the system administrator, are omitted.

domain: When an address is being directed, routed, or a local delivery is
taking place, this variable contains the domain. In particular, it is set
during filtering.

home: This is set to the user's home directory when user filtering is
configured in the normal way. When running a filter test via the -bf
option, home is set to the value of the environment variable HOME.

local_part: When an address is being directed, routed, or delivered
locally, this variable contains the local part. If a local part prefix or
suffix has been recognized, it is not included in the value of this
variable.

local_part_prefix: When an address is being directed or delivered locally,
and a specific prefix for the local part was recognized, it is available in
this variable. Otherwise it is empty.

local_part_suffix: When an address is being directed or delivered locally,
and a specific suffix for the local part was recognized, it is available in
this variable. Otherwise it is empty.

key: When a domain list is being searched, this variable contains the value
of the key, so that it can be inserted into strings for query-style
lookups. See chapter 6 of the main Exim specification for details. In other
circumstances this variable is empty.

message_body: This variable contains the initial portion of a message's
body while it is being delivered, and is intended mainly for use in filter
files. The maximum number of characters of the body that are used is set by
the message_body_visible configuration option; the default is 500.

message_id: When a message is being received or delivered, this variable
contains the unique message id which is used by Exim to identify the
message.

message_precedence: When a message is being delivered, the value of any
"Precedence" header is made available in this variable. If there is no such
header, the value is the null string.

message_size: When a message is being received or delivered, this variable
contains its size in bytes.

original_domain: When a top-level address is being processed, this contains
the same value as domain. However, if an address generated by an alias,
forward, or filter file is being processed, this variable contains the
domain of the original address.

original_local_part: When a top-level address is being processed, this
contains the same value as local_part. However, if an address generated by
an alias, forward, or filter file is being processed, this variable
contains the local part of the original address.

primary_hostname: The value set in the configuration file, or read by the
uname() function.

received_protocol: When a message is being processed, this variable con-
tains the protocol by which it was received.

recipients_count: When a message is being processed, this variable contains
the number of envelope recipients that came with the message. Duplicates
are not excluded from the count.

reply_address: When a message is being processed, this variable contains
the contents of the Reply-to: header if one exists, or otherwise the
contents of the From: header.

return_path: When a message is being delivered, this variable contains the
return path - the sender field that is sent as part of the envelope. In
many cases, this has the same value as sender_address, but if, for example,
an incoming message to a mailing list has been expanded by a director which
specifies a specific address for delivery error messages, then return_path
contains the new errors address, while sender_address contains the original
sender address that was received with the message.

sender_address: When a message is being processed, this variable contains
the sender's address that was received in the message's envelope.

sender_address_domain: The domain portion of sender_address.

sender_address_local_part: The local part portion of sender_address.

sender_fullhost: When a message has been received from a remote host, this
variable contains the host name and IP address, as a concatenated string,
with the IP address in square brackets. In the case of incoming SMTP
messages, the host name is the data received in the HELO or EHLO command.

sender_host_address: When a message has been received from a remote host,
this variable contains the host's IP address.

sender_host_name: When a message has been received from a remote host, this
variable contains the host's name (from the HELO or EHLO command, in the
case of SMTP).

sender_ident: When a message has been received from a remote host, this
variable contains the identification received in response to an RFC 1413
request. When a message has been received locally, this variable contains
the login name of the user that called Exim.

tod_bsdinbox: The time of day and date, in the format required for BSD-
style mailbox files, for example: Tue Oct 17 17:14:09 1995.

tod_full: A full version of the time and date, for example: Wed, 18 Oct
1995 09:51:40 +0100. The timezone is always given as a numerical offset
from GMT.

tod_log: The time and date in the format used for writing Exim's log files,
which is: 1995-10-12 15:32:29.

value: This variable contains the result of an expansion lookup operation,
as described above. If used in other circumstances, its contents are null.

version_number: The version number of Exim.
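
As an illustration of how these variables are typically used, the following
filter files mail whose envelope sender is in one particular domain; the
domain and the mailbox name are hypothetical:

  # Exim filter
  if $sender_address_domain is "lists.example"
  then
    save $home/mail/lists
  endif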

