
15. Searching Mail Folders

Af has search commands that let you search through the messages in a buffer and find those that match a regular expression. You can also search for messages that match a tag expression.

15.1 Searching for Regular Expressions  Searching for regular expressions.
15.2 Searching for Tagged Messages  Searching for tagged messages.
15.3 Tagging Matching Messages  Tag all messages which match a regex.
15.4 Syntax of Regular Expressions  The syntax of regular expressions.
15.5 Searching and Case  Should case be ignored while searching?



15.1 Searching for Regular Expressions

C-s regex RET
Search for regex (search-forward).
C-r regex RET
Search backward for regex (search-backward).

To do a search on a buffer (whether typeout or a mail buffer), use C-s or C-r. Af will prompt you for the regular expression to search for, and then the search takes place. If no messages match the regular expression then the search will fail with an error.

A second search immediately after the first will not match the current message, so repeated searches move through all the messages that match the regular expression. To make this more convenient, the search expression defaults to the last one you entered.

With a numeric argument, the search commands search only the headers of the messages. This is often convenient when (for example) looking for messages which are from a particular person.

With a negative numeric argument, the search commands search only the bodies of the messages. This can be useful when (for example) looking for messages which mention your machine's host name (which is included in the headers of all messages).
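
The behaviour described above can be sketched in Python. This is only an illustration, not af's implementation: the function names, the list-of-strings message store, the split at the first blank line, the treatment of every non-negative argument as "headers only", and the lack of wrap-around at the end of the buffer are all assumptions made for the example.

```python
import re

# Hypothetical sample buffer: each message is headers, a blank line, then the body.
MESSAGES = [
    "From: alice@example.org\nSubject: lunch\n\nSee you at noon.",
    "From: bob@example.org\nSubject: host\n\nalice said the server is down.",
    "From: alice@example.org\nSubject: again\n\nNothing new.",
]


def split_message(raw):
    """Split a raw message into (headers, body) at the first blank line."""
    headers, _, body = raw.partition("\n\n")
    return headers, body


def search_forward(messages, current, pattern, arg=None):
    """Return the index of the first message after `current` that matches.

    arg=None searches the whole message, a negative arg searches only the
    body, and any other arg searches only the headers (an assumption).
    The current message is skipped, so repeated calls step through matches.
    """
    regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    for index in range(current + 1, len(messages)):
        headers, body = split_message(messages[index])
        if arg is None:
            text = messages[index]
        elif arg < 0:
            text = body
        else:
            text = headers
        if regex.search(text):
            return index
    # Mirrors "the search will fail with an error".
    raise LookupError("search failed: " + pattern)
```

Starting before the first message, `search_forward(MESSAGES, -1, "alice")` returns 0. Repeating the search from message 0 returns 1, because the body of the second message mentions alice, whereas `search_forward(MESSAGES, 0, "alice", arg=1)` looks only at headers and returns 2.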



15.2 Searching for Tagged Messages

C-t C-s tagexpr RET
Search for messages matching tagexpr (tag-search-forward).
C-t C-r tagexpr RET
Search backward for messages matching tagexpr (tag-search-backward). 

To search for a message matching a tag expression, use C-t C-s or C-t C-r. Af will prompt you for the tag expression to search for (see section 13.1.4 Tag Expressions), and then the search takes place. If no messages match the tag expression then the search will fail with an error.

Just as with regular expression searches, a second search immediately after the first will not match the current message, so repeated searches move through all the messages that match the tag expression. To make this more convenient, the search expression defaults to the last one you entered.



15.3 Tagging Matching Messages

As well as just searching for a regular expression and moving point to the first matching message, af can tag all the messages which match a regular expression. To do this use C-t s (search-and-tag). You will be prompted for the regular expression to search for, and the tags to set on the matching messages (see section 13.2 Setting and Removing Tags). Once the search has finished, af will report how many messages were tagged.

With a numeric argument, this command searches only the headers of the messages. This is often convenient when (for example) looking for messages which are from a particular person or mailing list.
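
As a rough illustration of what search-and-tag does, here is a Python sketch. It is not af's code: the dictionary layout of a message, the `headers_only` parameter and the representation of tags as a set of strings are hypothetical choices made for the example.

```python
import re


def search_and_tag(messages, pattern, tags, headers_only=False):
    """Add `tags` to every message matching `pattern`; return how many were tagged.

    Each message is a dict with a "text" string and a "tags" set
    (a hypothetical layout chosen for this sketch).
    """
    regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    tagged = 0
    for message in messages:
        text = message["text"]
        if headers_only:
            # Headers end at the first blank line.
            text = text.partition("\n\n")[0]
        if regex.search(text):
            message["tags"].update(tags)
            tagged += 1
    return tagged


# Hypothetical sample data.
MAILBOX = [
    {"text": "From: list@example.org\n\nhello", "tags": set()},
    {"text": "From: bob@example.org\n\nsee list@example.org", "tags": set()},
]
```

With this sample data, `search_and_tag(MAILBOX, "list@", {"l"}, headers_only=True)` tags only the first message and returns 1; without `headers_only` both messages match.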



15.4 Syntax of Regular Expressions

Regular expressions have a syntax in which a few characters are special constructs and the rest are ordinary. An ordinary character is a simple regular expression which matches that same character and nothing else. The special characters are `$', `^', `.', `*', `+', `?', `[', `]' and `\'. Any other character appearing in a regular expression is ordinary, unless a `\' precedes it.

For example, `f' is not a special character, so it is ordinary, and therefore `f' is a regular expression that matches the string `f' and no other string. (It does not match the string `ff'.) Likewise, `o' is a regular expression that matches only `o'. (When case distinctions are being ignored, these regular expressions also match `F' and `O', but we consider this a generalization of "the same string", rather than an exception.)

Any two regular expressions a and b can be concatenated. The result is a regular expression which matches a string if a matches some amount of the beginning of that string and b matches the rest of the string.

As a simple example, we can concatenate the regular expressions `f' and `o' to get the regular expression `fo', which matches only the string `fo'. Still trivial. To do something nontrivial, you need to use one of the special characters. Here is a list of them.

. (Full stop)
is a special character that matches any single character except a newline. Using concatenation, we can make regular expressions like `a.b' which matches any three-character string which begins with `a' and ends with `b'.

*
is not a construct by itself; it is a postfix operator, which means to match the preceding regular expression repetitively as many times as possible. Thus, `o*' matches any number of `o's (including no `o's).

`*' always applies to the smallest possible preceding expression. Thus, `fo*' has a repeating `o', not a repeating `fo'. It matches `f', `fo', `foo', and so on.

+
is a postfix character, similar to `*' except that it must match the preceding expression at least once. So, for example, `ca+r' matches the strings `car' and `caaaar' but not the string `cr', whereas `ca*r' matches all three strings.

?
is a postfix character, similar to `*' except that it can match the preceding expression either once or not at all. For example, `ca?r' matches `car' or `cr'; nothing else.

[ ... ]
is a character set, which begins with `[' and is terminated by `]'. In the simplest case, the characters between the two brackets are what this set can match.

Thus, `[ad]' matches either one `a' or one `d', and `[ad]*' matches any string composed of just `a's and `d's (including the empty string), from which it follows that `c[ad]*r' matches `cr', `car', `cdr', `caddaar', etc.

You can also include character ranges in a character set, by writing two characters with a `-' between them. Thus, `[a-z]' matches any lower-case letter. Ranges may be intermixed freely with individual characters, as in `[a-z$%.]', which matches any lower-case letter or `$', `%' or `.'.

Note that the usual regex special characters are not special inside a character set. A completely different set of special characters exists inside character sets: `]', `-' and `^'.

To include a `]' in a character set, you must make it the first character. For example, `[]a]' matches `]' or `a'. To include a `-', write `-' as the first or last character of the set. Thus, `[]-]' matches both `]' and `-'.

To include `^', make it other than the first character in the set.

[^ ... ]
`[^' begins a complemented character set, which matches any character except the ones specified. Thus, `[^a-z0-9A-Z]' matches all characters except letters and digits.

`^' is not special in a character set unless it is the first character. The character following the `^' is treated as if it were first (`-' and `]' are not special there).

^
is a special character that matches the empty string, but only at the beginning of a line in the text being matched. Otherwise it fails to match anything. Thus, `^foo' matches a `foo' which occurs at the beginning of a line.

$
is similar to `^' but matches only at the end of a line. Thus, `xx*$' matches a string of one `x' or more at the end of a line.

\
has two functions: it quotes the special characters (including `\'), and it introduces additional special constructs.

Because `\' quotes special characters, `\$' is a regular expression which matches only `$', and `\[' is a regular expression which matches only `[', etc.

For the most part, `\' followed by any character matches only that character. However, there are several exceptions: two-character sequences starting with `\' which have special meanings. The second character in the sequence is always an ordinary character on its own. Here is a table of `\' constructs.

\{ ... \}
is a postfix construct, similar to `*' except that it allows you to specify the number of times the preceding expression must be matched. So, for example, `ca\{3\}r' will match only the string `caaar'.

If you add a comma after the number of times the expression must be matched, then the expression must be matched at least as many times as you specified. So `ca\{2,\}r' will match the strings `caar', `caaar', `caaaar', and so on.

You can also add a maximum value after the comma, to specify a range of values. So `ca\{1,3\}r' will match only the strings `car', `caar' and `caaar'.

\|
specifies an alternative. Two regular expressions a and b with `\|' in between form an expression that matches anything that either a or b matches.

Thus, `foo\|bar' matches either `foo' or `bar' but no other string.

`\|' applies to the largest possible surrounding expressions. Only a surrounding `\( ... \)' grouping can limit the scope of `\|'.

\( ... \)
is a grouping construct that serves three purposes:

  1. To enclose a set of `\|' alternatives for other operations. Thus, `\(foo\|bar\)x' matches either `foox' or `barx'.

  2. To enclose a complicated expression for the postfix operators `*', `+' and `?' to operate on. Thus, `ba\(na\)*' matches `bananana', etc., with any (zero or more) number of `na' strings.

  3. To mark a matched substring for later reference with `\N'.

\N
When you use `\( ... \)' in an expression, you can look for another match for the exact same text that was matched inside the `\( ... \)'. The two-character sequence `\N' will match the same text as was matched by the Nth use of `\( ... \)'. The first nine uses are remembered, and are assigned the numbers `1' to `9'. So `\1' matches the text that was matched by the first use of `\( ... \)'.

For example, `\([a-z]\)\1' matches any lower-case character that is immediately followed by a second copy of itself, such as `aa' or `zz', but not `ab'. The `\([a-z]\)' matches any lower-case character, while the `\1' must match the same character.

If a use of `\( ... \)' matches more than once, which often happens if it is followed by `*' or `+', only the last match is stored for use with `\N'.
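
The examples in this section can be tried out with Python's `re` module, as in the sketch below. It rests on an assumption about translation, not on af's own matcher: in Python the grouping, alternation and interval constructs are written without the backslash (`(...)', `|', `{...}'), while the other constructs used here keep the same spelling. The table layout and the `check' function are hypothetical, and some of the non-matching strings are additions that do not appear in the text above.

```python
import re

# Each entry: (pattern as written in this manual, assumed Python equivalent,
#              strings that should match, strings that should not match).
CASES = [
    ("fo*", r"fo*", ["f", "fo", "foo"], ["o", "fofo"]),
    ("ca+r", r"ca+r", ["car", "caaaar"], ["cr"]),
    ("ca?r", r"ca?r", ["car", "cr"], ["caar"]),
    ("c[ad]*r", r"c[ad]*r", ["cr", "car", "cdr", "caddaar"], ["cxr"]),
    ("[]a]", r"[]a]", ["]", "a"], ["b"]),
    ("[]-]", r"[]-]", ["]", "-"], ["a"]),
    ("[^a-z0-9A-Z]", r"[^a-z0-9A-Z]", ["%", " "], ["q", "7"]),
    (r"ca\{3\}r", r"ca{3}r", ["caaar"], ["caar", "caaaar"]),
    (r"ca\{2,\}r", r"ca{2,}r", ["caar", "caaaar"], ["car"]),
    (r"ca\{1,3\}r", r"ca{1,3}r", ["car", "caar", "caaar"], ["cr", "caaaar"]),
    (r"\(foo\|bar\)x", r"(foo|bar)x", ["foox", "barx"], ["foo", "x"]),
    (r"ba\(na\)*", r"ba(na)*", ["ba", "bananana"], ["ban"]),
    (r"\([a-z]\)\1", r"([a-z])\1", ["aa", "zz"], ["ab"]),
]


def check(cases):
    """Return a list of (manual pattern, text, problem) for every mismatch."""
    failures = []
    for manual_pattern, python_pattern, good, bad in cases:
        regex = re.compile(python_pattern)
        for text in good:
            # fullmatch requires the whole string to match the pattern.
            if regex.fullmatch(text) is None:
                failures.append((manual_pattern, text, "should match"))
        for text in bad:
            if regex.fullmatch(text) is not None:
                failures.append((manual_pattern, text, "should not match"))
    return failures
```

Calling `check(CASES)` returns an empty list when every example behaves as described. The sketch uses `fullmatch' so that each string is compared against the whole pattern; a search, by contrast, succeeds if the pattern matches anywhere in the text.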

Here is a moderately complicated regex, which you might use to find messages from the af-bug or af-user mailing lists.

 
^From:.*af-\(bug\|user\)@csv.warwick.ac.uk
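
To see how this expression behaves, the following Python sketch applies an assumed translation of it to a few header lines. The translation drops the backslashes from the group and the alternation, and uses `re.MULTILINE' so that `^' matches at the start of any line; the `is_list_mail' helper and the sample headers are hypothetical.

```python
import re

# Assumed Python spelling of the manual's expression.
LIST_REGEX = re.compile(
    r"^From:.*af-(bug|user)@csv.warwick.ac.uk",
    re.IGNORECASE | re.MULTILINE,
)


def is_list_mail(headers):
    """Return True if any header line looks like a From: line naming either list."""
    return LIST_REGEX.search(headers) is not None


# Hypothetical sample headers.
SAMPLES = {
    "From: af-bug@csv.warwick.ac.uk": True,
    "Received: by mailhost\nFrom: Someone <af-user@csv.warwick.ac.uk>": True,
    "To: af-bug@csv.warwick.ac.uk": False,
    "From: af-dev@csv.warwick.ac.uk": False,
}
```

Each key in `SAMPLES' is paired with the result `is_list_mail' gives for it. Note that the dots in the host name are not quoted, so each one matches any character; writing `\.' in their place would make the expression stricter.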



15.5 Searching and Case

Searches in af normally ignore the case of the text they are searching through. Thus, if you search for `foo', then `Foo' and `FOO' are also considered matches. Regular expressions, and in particular character sets, are included: `[ab]' matches `a', `A', `b' or `B'.

If you set the variable case-fold-search to false, then all letters must match exactly, including case.
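
The effect of case-fold-search can be imitated in Python with the `re.IGNORECASE' flag. This is only an analogy under stated assumptions: the `compile_search' helper and its `case_fold_search' parameter are hypothetical stand-ins for the af variable, not part of af.

```python
import re


def compile_search(pattern, case_fold_search=True):
    """Compile `pattern`, ignoring case unless case_fold_search is False."""
    flags = re.IGNORECASE if case_fold_search else 0
    return re.compile(pattern, flags)


# With folding on (the default), case differences are ignored.
folding = compile_search("foo")
# With folding off, all letters must match exactly.
exact = compile_search("foo", case_fold_search=False)
```

Here `folding.search("Foo")' finds a match, while `exact.search("Foo")' returns None. Character sets fold too: `compile_search("[ab]").fullmatch("B")' succeeds.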



This document was generated by Malc Arnold on August, 22 2002 using texi2html