Table of contents ftp://cs.uta.fi/pub/ssjaaa/pm-tips.html

1.0 Document id
2.0 UBE in Internet
3.0 Anti-UBE pointers
4.0 Procmail pointers
5.0 Dry run testing
6.0 Things to remember
7.0 Procmail flags
8.0 Matching and regexps
9.0 Variables
10.0 Suggestions and miscellaneous
11.0 Scoring
12.0 Formail usage
13.0 Procmail, MIME and HTML
14.0 Simple recipe examples
15.0 Miscellaneous recipes
16.0 Procmail and PGP
17.0 Includerc usage
18.0 Mailing list server
19.0 Common troubles
20.0 Implementation details
21.0 Technical matters
22.0 Different version features and bugs
23.0 Smartlist
24.0 Additional procmail or MUA software
25.0 Additional procmail software for Emacs
26.0 Procmail, Emacs and Gnus
27.0 RFC, Request for comments
28.0 Introduction to E-mail Headers
29.0 Message's headers
30.0 MIME tags
31.0 Jokes
32.0 Other Code
End

1.0 Document id [toc]

1.1 General [toc]

@(#) $Id: pm-tips.txt,v 1.31 1998/03/10 08:29:52 jaalto Exp $
$Docid: 1998-03-10 Jari Aalto $
$Contactid: jari.aalto@poboxes.com $
$URL: http://www.netforward.com/poboxes/?jari.aalto $
$FileServer: send mail to Contactid with subject "send help" $
$Keywords: procmail sendmail formail mail UBE UCE spam filter $

@(#) This is a procmail tips page: a collection of procmail recipes,
@(#) instructions and howtos. The document also contains URL pointers to
@(#) the procmail mailing list and sites that fight against Internet
@(#) UBE. You will also find many other interesting subjects that
@(#) deal with Internet email: headers, MIME and RFCs.

@(#) The tips are compiled from the procmail discussion list,
@(#) from comp.mail.misc and from the author's own experiences with
@(#) procmail. The document is actively maintained and new sections
@(#) appear every 2 or 3 weeks.

This document does not intend to teach you the basics of procmail; you need to be familiar with the procmail man pages before this document's tips are useful to you. Please also read Nancy's and Era's procmail FAQ pages before this page. Era's link page in particular is an excellent collection of useful procmail links and Unix programs that deal with email (e.g. MHonArc Email hyperarchiver). If you find errors or things to improve in this document, please go ahead and send mail to [jari].

If you add a link to the absolute address of this site/page, please also mention/include this link:

      http://www.netforward.com/poboxes/?jari.aalto

The above permanent WWW link points to the main page, from which this document can be accessed even if this account ceases to exist or if this page is renamed or moved to another site.

If you want automatic notification whenever the page changes, please visit the address below and register this page on your reminder list:

      http://www.netmind.com/URL-minder/new/register.html

To get nicely formatted netmind messages, see the procmail module pm-janetmind.rc.

1.2 Abbreviations and thanks [toc]

People and documents referred to, in no particular order.

[stephen] Stephen R. van den Berg, Author of Procmail
[alan] Alan K. Stebbens aks@anywhere.engr.sgi.com
[david] David W. Tamkin dattier@wwa.com
[phil] Philip Guenther guenther@gac.edu
[elijah] Eli the Bearded process@qz.little-neck.ny.us
[aaron] Aaron Schrab aaron+procmail@schrab.com
[dan] Daniel Smith dan@bristol.com
[hal] Hal Wine hal@dtor.com
[sean] Sean B. Straw PSE-L@mail.professional.org
[ed] Edward J. Sabol sabol@alderaan.gsfc.nasa.gov
[jari] Jari Aalto jari.aalto@poboxes.com
[faq] Procmail FAQ j1era+pr@iki.fi
[manual] Quote from some procmail manual page

I also thank following people

1.3 Version information [toc]

Here is a version and file size log of the text file, which gives you some idea of how often you should update your copy.
      v1.01   1997-09-13  46K
      v1.05   1997-09-14  53K
      v1.5    1997-09-16  76K
      v1.6    1997-09-18  94K
      v1.8    1997-10-01  127K
      v1.9    1997-10-11  142K
      v1.10   1997-10-13  181K  archive file 1995-10's tips included
      v1.13   1997-11-08  218K  Era's correction suggestions.
      v1.14   1997-11-25  260K
      v1.17   1997-12-09  343K  up till archive 1996-07 now included
      v1.24   1997-12-30  415K  up till 1996-12 is now included
      v1.29   1998-01-30  429K  "regexp" section rewrite.
      v1.31   1998-03-10  469K  Better ordering: ORing rules discussed

Please also familiarise yourself with Unix what(1) and GNU RCS ident(1), if you have those commands on your system. It is important that you mark interesting text for these tools so that someone can get an overview of the procmail rc files you supply.

      % what FILES        - Print @( # ) tags
      % ident FILES       - Print $ $ keywords
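
As a rough illustration of what these two tools look for, here is a shell sketch that approximates them with sed and grep. It is not a replacement for what(1) or ident(1): the function names and the sample file contents are made up for this example, and grep -o is assumed to be available (GNU and BSD grep have it).

```shell
# list_what_tags FILE: print the text that follows each "@(#)" marker,
# roughly what what(1) would show.
list_what_tags() {
    sed -n 's/.*@(#) *//p' "$1"
}

# list_rcs_keywords FILE: print "$Keyword: value $" markers,
# roughly what ident(1) would show.
list_rcs_keywords() {
    grep -o '\$[A-Za-z]*: [^$]*\$' "$1"
}

# Hypothetical procmail rc file carrying both kinds of markers.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# @(#) Hypothetical module that files mailing list messages
# $Docid: 1998-03-10 Example Author $
:0
* ^TO_mylist
list.mylist
EOF

list_what_tags "$sample"
list_rcs_keywords "$sample"
rm -f "$sample"
```

If your own rc files carry such markers near the top, a reader can get the overview without opening the file.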

1.4 Document layout [toc]

This document is maintained in plain text format with Emacs and the text formatting package tinytf.el (automatic TOC and indentation control). If you see some funny marks or indentation in the text version, they are there on purpose so that the Perl text to html filter t2html.pls can be used. All the tools can be found at ftp://cs.uta.fi/pub/ssjaaa/

1.5 About presented recipes [toc]

The recipes presented here are collected from the net and procmail archives. I have tried my best to keep the recipes as original as possible, but I have also generalised some examples a little. If some recipe doesn't work as announced, please a) send a note to [jari] or b) send email to the procmail mailing list and ask how to correct it. I will watch the procmail list and replace any faulty recipe with a correct one.

I have taken the liberty of sometimes using a plain dot (.) in regular expressions where the right, pedantic way would have been to use an escaped dot. If you want to be very strict, you should use the escaped dot where applicable. Usually there are no false hits whether you escape the dot or not. Consider the following; I prefer the version with less "line noise".

      [free hand version]     [pedantic version]
      :0                      :0
      * match.this.site       * match\.this\.site
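
To see what the unescaped dot can let through, here is a small shell sketch that uses grep as a stand-in for procmail's regexp matching (an assumption: the dot means "any character" in both). The function name and the three sample lines are invented for the illustration.

```shell
# count_matches REGEX: count how many of three made-up sample lines match REGEX.
count_matches() {
    printf '%s\n' 'match.this.site' 'match-this-site' 'other.site' |
        grep -c -e "$1"
}

count_matches 'match.this.site'      # free hand: also hits "match-this-site"
count_matches 'match\.this\.site'    # pedantic: hits only the literal dots
```

The free hand version counts one line more than the pedantic one. In real headers such near-miss strings are rare, which is why the shorter form is usually safe enough.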

Procmail also accepts assignments without quotes, like this:

      var = value
      num = 1
      dir = /var/mail

But I have adopted a style where literal strings are assigned with double quotes:

      var = "value"

because the procmail code checker then won't warn you about a missing dollar sign, which you might very well have forgotten. For numeric values or directories there is no risk of misinterpretation, so the quotes are not essential.

      #   If you do this...
      var = value
      #   then it's not certain what was intended.
      #   In this doc only these formats are used
      var = "value"   # literal assignment
      var = $value    # another variable assignment

1.6 Variables used in recipes [toc]

These are part of the procmail module pm-javar.rc and are used in the recipes:
      # Pure newline; typical usage: LOG = "$NL message $NL"
      #
      NL = "
      "

Refer to "improving Space-Tab syndrome" section for more details

      WSPC    = "     "               # whitespace: space + tab
      SPC     = "[$WSPC]"             # Regexp: space + tab
      NSPC    = "[^$WSPC]"            # negation
      WSPCL   = "( |  |$)"            # whitespace + linefeed: spc/tab/nl
      s       = $SPC                  # shortname: like perl -- \s
      d       = "[0-9]"               # A digit -- Perl \d
      w       = "[0-9a-z_A-Z]"        # A word  -- Perl \w
      W       = "[^0-9a-z_A-Z]"       # Non-word -- Perl \W
      a       = "[a-zA-Z]"            # A letter, only alphabets

Writing recipes is now a little easier and they may look clearer.

      :0
      *$ Header:$s+$d+$s+$d    # Matches "Header: 11 12"
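
If you want to try such shorthand patterns outside procmail, the following shell sketch builds the same expression and tests it with grep -E. It only imitates the idea: the shell variables s and d mirror the procmail ones above, while the function name and the sample header lines are made up.

```shell
# Shell stand-ins for two of the shorthand variables above.
TAB=$(printf '\t')
s="[ $TAB]"      # space or tab
d='[0-9]'        # a digit

# matches_header LINE: succeed if LINE fits the pattern Header:$s+$d+$s+$d
matches_header() {
    printf '%s\n' "$1" | grep -E -q "Header:$s+$d+$s+$d"
}

matches_header 'Header: 11 12'  && echo 'match'
matches_header 'Header: eleven' || echo 'no match'
```

Because the character classes live in variables, a typo in the space-tab class is fixed in one place instead of in every recipe.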

PMSRC = Procmail source directory. Anywhere you want it to be: usually $HOME/pm or $HOME/.procmail. Here you keep the procmail files, logfiles and includerc scripts. You can also use synonym PMDIR.

SPOOL = Directory where procmail delivers the categorized messages. For example mailing lists: list.procmail, list.lyx-users, list.emacs, list.elm; work mail: work.announcements, work.lab; and your private messages: junk.daemon, junk.cron, mail.usenet, mail.fi.local. If you read the procmail delivered files directly, this directory is usually $HOME/Mail or $HOME/mail. If you use some other software that reads these files as mail spool files (like Emacs Gnus), then this directory is typically ~/Mail/spool.

MY_XLOOP = Used to prevent resending messages that have already been handled. Typically "$LOGNAME@$HOST", but this can be any user chosen string.

1.7 About "useless use of cat award" [toc]

Randal Schwartz, a well known Perl programmer and Perl book author, started giving out awards for the "useless use of cat command" whenever someone wrote an example with cat and a pipe instead of the "<" token. Like this:
      % cat file.name.this | wc -l

Instead he insisted that the call should be written like this, which saves the pipe. (Never mind that wc can read the file directly; this is an example.)

      % wc -l < file.name.this

I'll stick my opinion in this soup, and you're free to disagree. The shell commands used in this document are written so that they can be read from left to right. The "<" is in my opinion hard to understand. See this:

      % cmd1 < file1 | cmd2 > file2

Arrggh I say: Yes, I know what it means but we are not forced to use that convention. This is much more readable IMO.

      % cat file1 | cmd1 | cmd2  > file2

And now to the purist side: is saving one pipe process so important? Lemme see, I have a 2Meg file in this test:

      % time sh -c "cat some-file-name-is-here | wc -l"
      0.29u 0.11s 0:00.47 85.1%
      % time sh -c "wc -l < some-file-name-is-here"
      0.27u 0.05s 0:00.39 82.0%

Gee, there is not much difference, and this 2Meg file is not typical at all. The files used are many times smaller. The nitpicking is therefore pointless. There is one more reason why I use "left to right pipe writing": when you recall the command in csh, you can edit the last command's argument easily. If you used the "<" token, tapping the keyboard is much more tedious (try changing the wc command's option above). Oh yeah, you can write it like this to get the command to the right, but that's even more obscure.

      % < some-file-name-is-here wc -l

Ahem, so there, I got it off my chest...
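
For the record, the three spellings discussed above really are interchangeable as far as the result goes. The sketch below checks that on a throwaway file; the function names and the file contents are invented, and tr only strips the padding that some wc versions print.

```shell
count_with_cat()       { cat "$1" | wc -l | tr -d ' '; }   # left to right
count_with_redirect()  { wc -l < "$1" | tr -d ' '; }       # the "purist" form
count_redirect_first() { < "$1" wc -l | tr -d ' '; }       # redirect written first

# Throwaway three-line sample file.
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

count_with_cat "$sample"
count_with_redirect "$sample"
count_redirect_first "$sample"
rm -f "$sample"
```

All three print the same line count, so the choice is about readability and editing comfort, not about correctness.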

1.8 Sending improvements [toc]

English is not my native language, so I apologise for any bad language I may have used in this document. If you have 5-10 minutes to spare and find a spelling mistake or misuse of English, please go ahead and send me a patch to correct the wording. The preferred way to send corrections to this document is in diff(1) format. Here is how you make the corrections and pass them to me.

The diff option -u is only available in GNU diff; please try to send a -u diff if possible. If you don't have the -u option in your diff command, use -c instead.

      %   cp pm-tips.txt pm-tips.txt.orig
      ...load pm-tips.txt into your text editor
      ...edit the file and save
      ...Print the version number first
      %   what pm-tips.txt    > pm-tips.txt.diff  # see man what(1)
      %   diff -u -bw pm-tips.txt.orig pm-tips.txt >> pm-tips.txt.diff
      ...Send the mail
      %   cat pm-tips.txt.diff | mail jari.aalto@poboxes.com
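
If you would like to try the diff step safely first, here is a self-contained sketch that runs it in a scratch directory. The make_patch function and the one-line sample files are made up for the demonstration; the what(1) step is left out because that command is not installed everywhere, and nothing is mailed.

```shell
# make_patch ORIGINAL EDITED: unified diff that ignores whitespace changes.
# Note that diff exits with status 1 when the files differ.
make_patch() {
    diff -u -bw "$1" "$2"
}

work=$(mktemp -d)
printf 'This is a exellent tip.\n'   > "$work/pm-tips.txt.orig"
printf 'This is an excellent tip.\n' > "$work/pm-tips.txt"

make_patch "$work/pm-tips.txt.orig" "$work/pm-tips.txt" \
    > "$work/pm-tips.txt.diff" || true

cat "$work/pm-tips.txt.diff"    # inspect the patch before sending it
rm -rf "$work"
```

Reading the patch once before mailing it catches the case where the editor reformatted the whole file and the diff became huge.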

1.9 What is Procmail? [toc]

[faq] Procmail is a mail processing utility, which can help you filter your mail; sort incoming mail according to sender, Subject line, length of message, keywords in the message, etc; implement an ftp-by-mail server, and much more. Procmail is also a complete drop-in replacement for your MDA. (If this doesn't mean anything to you, you don't want to know.)

Procmail runs under Unix. See Infinite Ink's Mail Filtering and Robots page for information about related utilities for various other platforms, and competing Unix programs, too (there aren't that many of either).


2.0 UBE in Internet [toc]

2.1 Terms used and foreword [toc]

[Parts of this have been excerpted from the Email Abuse Faq]

UBE = Unsolicited Bulk Email.
UCE = (subset of UBE) Unsolicited Commercial Email.

Spam = Spam describes a particular kind of Usenet posting (and canned spiced ham), but is now often used to describe many kinds of inappropriate activities, including some email-related events. It is technically incorrect to use "spam" to describe email abuse, although attempting to correct the practice would amount to tilting at windmills.

Spam = definition by Erik Beckjord. "Some people decide that Spam is anything you decide you want to ban if you can't handle the intellectual load on a list." Remember: this is not to be confused with real spam, which is unwanted bulk mail.

What people are nowadays seeking is a cure for stopping and handling UBE. That can be easily done with procmail (by you) and with sendmail (by your sysadmin). In order to select the right strategy against UBE messages, you should read this section and then decide how you will use procmail to deal with it.

2.2 UBE strategies [toc]

[Excerpted from the Email Abuse Faq]

4g. I asked to be "removed" - guess what? I got another U*E

Not surprisingly, many UBE outfits treat a "remove" request as evidence that the address is "live"; a "remove" request to some bulk emailers will actually guarantee that they will send more to you. For many others, the remove procedure does not work, either by chance or design. At this point perhaps you're starting to get a feel for the type of people with whom you are dealing.

Also, getting removed doesn't keep you from being added the next time they mine for addresses, nor will it get you off other copies of the list that have been sold or traded to others. In summary, there is no evidence of "remove" requests being an effective way to stop UBE.

4h. I asked to be "removed" - guess what? The message bounced

Probably the remove procedure was false. Any remove procedure that tells you to send remove requests to AOL, CompuServe, Prodigy, Hotmail, or Juno is certainly false. The bulk emailers are an unpopular lot; they forge headers, inject messages into open SMTP ports, use temporary accounts, and pull other stunts to avoid the tirade of complaints that follow every mailing.

2.3 UBE and bouncing message back [toc]

Has anyone found that bouncing spam does any good at all?

[sean] I had a whole policy message written up that would be sent out to spammers. Nothing but a waste of my resources. Most return paths are either completely bogus, or end up bouncing pretty damn soon after the spam, which just brings you more junk to deal with.

Instead, I choose to send messages occasionally to administrators and upline providers of domains which spew. "Agreement by action" is one of the legal standards I like to use (for "should you continue to send mail to me, that constitutes acceptance of the terms herein").

InterNIC recently (1997-07) removed the root files for .com, .org, and .net (I think) from access at their ftp server. Too many spammers were using them for the purpose of generating mailing lists. Access to the files now requires an assigned FTP account from InterNIC. When I get a domain-style spam, I immediately do a whois to get DNS info on the domain, then grep the root files to obtain a list of domains serviced by the same DNS. If they appear spammy (as spam domains tend to), I add these to a list of domains to filter (egrep) in my primary domain-based ruleset. Works for me, though the list is getting big.
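
The domain list idea can be sketched in plain shell as follows. This is only an illustration of checking a sender's domain against a list file, not [sean]'s actual ruleset: the function name, the list format (one domain per line) and the sample domains are assumptions.

```shell
# is_listed_domain ADDRESS LISTFILE: succeed if the domain part of ADDRESS
# appears as a whole line in LISTFILE (one lowercase domain per line).
is_listed_domain() {
    domain=${1##*@}                       # keep everything after the last "@"
    printf '%s\n' "$domain" | tr 'A-Z' 'a-z' |
        grep -F -x -q -f "$2"             # fixed strings, whole-line match
}

# Hypothetical list of domains collected from earlier spam.
blocklist=$(mktemp)
printf '%s\n' 'bulk-mailer.example' 'free-magazine.example' > "$blocklist"

if is_listed_domain 'offers@Free-Magazine.example' "$blocklist"; then
    echo 'sender domain is on the list'
fi
rm -f "$blocklist"
```

Matching whole lines with fixed strings keeps a listed "bulk-mailer.example" from also catching an unrelated "notbulk-mailer.example", which a loose egrep pattern could do.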

[Kimmo Jaskari kimmo@alcom.aland.fi] Another good reason is that all those bounces, which get ignored by the spammer/recipient anyway, still take up needless bandwidth on the net. The spam is bad enough for that; bouncing it back with some more stuff added is just plain silly. You become part of the problem rather than the solution. If the bounce even gets to the spammer, the spammer drops it on the floor unseen.

2.4 UBE and "I don't mind" attitude [toc]

...whenever you see a spam you don't want, hit the delete key and move on. Grow up and get a life, folks. The spams just don't bother me. Why the hell does everyone have to go up in arms every time someone sends a spam? Spams are harmless! Spams are even sometimes interesting and/or useful!

[Responses from thread in procmail mailing list 1995-10 to "FREE 1 yr. Magazine" spam.]

[Soren Dayton csdayton@midway.uchicago.edu]

The simplest reason against UBE is that it is rude. It costs some people money to get email on some commercial services. This is fundamentally different from junk snail mail for this reason, and too much spam can prevent people from getting mail (mailboxes can fill up). So it is both an intrusion into my life and it can conceivably end in me either losing money or losing mail (which is far more important). It is a burden on the receiver far beyond just hitting the delete key.

[Mark Seiden mis@seiden.com]

People who are able to monitor the incoming machines of one of the larger online services (like me) can see a sizeable increase in system load average and volume directly resulting from spams. This competition for fixed resources inevitably translates to reduced service for "first class" mail.

It is impossible to engineer a mail system that can cope with an unlimited amount of abuse. This is in addition to the difficulties of doing so on a fixed price economic model, and the difficulties of keeping up with the successful rapid expansion of the population to be served.

Even if you, an individual, aren't charged anything per piece of mail, there are costs borne by your service provider per piece of mail, and these are somehow passed on to you. (They've calculated an average across their entire user population to come up with a "monthly cost of Internet mail".)

Spamsters and bulk mailers are not at all concerned about efficiency. As proof of that, many of them are not even courteous enough to supply a proper return address, so they can prune their lists of undeliverable mail. All they care about is getting their message across without their paying anything whatsoever for that service.

Watch how this will inevitably translate into increased costs for you, the consumer, unless we change the mechanisms by which bulk mail is delivered as well as putting an appropriate economic model in place.

[Steve Simmons scs@lokkur.dexter.mi.us]

If you tolerate spamming, it will only get worse. Spamming has been stopped again and again. Almost without exception, the spammers have been tracked down and, via one means or another, have been convinced to stop spamming.

Spams are harmless? I've already seen the 'Magazine Sub' message 10 or 12 times. I have a low bandwidth line. If I continue to tolerate spamming, I will pay a very real penalty in performance as tens, then thousands of spammers do it. Not to mention the personal time involved in taking care of the crap.

Don't think that the time involved is significant? Just wait. My wife and I are fairly generous with our time and money. As a result, we were getting an average of five telephone calls *per night* asking for money for various causes. A year ago, I adopted a new policy -- I will not under any circumstances give money to a caller, and will only consider it upon written solicitation. I ask them to put me on their `do not call list'. If they do *anything else* to continue the conversation, I hang up on them.

My wife opposed this, and we agreed to disagree -- if they ask for her, they get her. If they ask for me, they get my speech. After a year, she is getting 2-3 calls per night and I'm getting one or two a week.

My point here is that individual action does get re-action from the mailers. For them, I copy their internet providers on my complaints and call their Better Business Bureau. It works.

If one does this politely and consistently, 98% of the spammers will stop. The remaining 2% will discover that they're in a different world from direct mail or telephone solicitation. Their mailboxes will be overloaded with complaints (when it takes a single keystroke to invoke your complain macro, you're very likely to complain). Then their suppliers' mailboxes will be overloaded with complaints. The free magazine folks, who've been hiding behind false ids and forging mail, will find that they're on the wrong side of the law. I'm considering contacting their local legal officials and urging them to investigate, because it sure looks like fraud to me (read `Consumer Reports' for a similar case by surface mail). Should a few more like this come in, I will contact their legal authorities. We have their fax number; it's all we need to find them.

[Carl Payne cpayne@optical.fiber.net]

Um, I don't know about you or anyone else here, but this cutesy, "it's-okay-by-me" spam has been circulated under half a dozen different user names and "domains" on as many mailing lists. It's obvious to me the sender is trying to make people pissed off--how can he possibly think someone will buy that crap, and why does he think it's okay to send 19 and 20K files over a billion groups?

AFAIC, it has to stop. Now. I'm tired of the spam, I'm tired of the "Who cares" attitude about spam, I'm tired of ISPs letting people spam, I'm tired of the jetwash of spam, and I'm tired of the bleedinghearts that say, "Golly, just ignore it, and it'll go away."

I've got news for you all: when this method of spamming becomes the preferred method of "marketing" on the internet, and people like us are the bad guys because we're not allowing such litter to fly across the fiber, you will care. You will say something, most probably, "Why didn't we do something about this sooner?"

The guy in the next cube from you, who's paying a per-message charge through his ISP, is probably going, "Dammit, over three dollars this month on mail I've itemized as being spam." While that doesn't seem like a lot, I revert to my earlier statement: if this becomes the preferred method, his bill (and yours) will go up, and everyone will wonder why it's too out of control to do anything about.

Spam has the letters m-a-s in it, which en Espanol, means "more." I say no. Not only no, but hell no. And, I refuse to be told that my thinking is out of line just because I don't want my mailbox flooded. Do something now. Do anything now. But, don't be quiet and listen to anything that sounds like an endorsement of litter.

2.5 Is one or two UBE messages acceptable [toc]

Ray Everett-Church, Attorney/Online Consultant Co-Founder & Congressional Liaison <http://www.everett.org> Coalition Against Unsolicited Commercial Email; article 1997-12 in remailer politics mailing list

In developing what eventually became the Smith Bill, CAUCE discussed this rather extensively among our drafting committee. The bill gives a cause of action against the advertiser, not any of the pathways taken between you and them. This is consistent with the interpretation of the fax law (and many other laws for that matter) wherein the advertiser -- not the advertiser's agent -- is responsible for the act committed.

As for the single UCE versus bulk issue, the general consensus has been that while a single piece of spam does not do much damage, it is fundamentally no less a cost shift than 10 identical messages, or 100, or 1000, or a million. The only difference is that the costs being shifted are greater and greater. We discussed many cut off points... would 50 spams be acceptable? 25? 10? Would one really well crafted, hand written, heartfelt and personalized spam be permissible? And in the end we felt like we were discussing angels on the heads of pins.

While virtually nobody's system will crash because of one piece of spam (although George Nemeyer had trouble with three or four pieces as I recall), what is the ultimate difference if you only get one piece from each of 15 different advertisers a day? If one spam is ok, but two are bad, what is the interval... a day, a week? Enforcement depends on knowing when the threshold is crossed.

So here's a scenario: you receive three spams from what is, unbeknownst to you, the same person (one advertising weightloss pills from WeightLoss Associates at PO Box 1, one for an MLM from MLM Company at PO Box 2, and Bee Pollen from Pollen Partnership at PO Box 3). Each was individually crafted and appeared to be mailed only to you.

Under the scenario above, if the law permits one spam, will you sue?

Would you risk suing one or all of them, gambling that they sent the spam to anyone other than you (or whatever the threshold is... 10, 25, 50)? Would you risk suing one or all of them on the chance that they were somehow related? What if there was a chance that you'd find out that the three companies were really different? What if you did sue and found that they were owned by the same person, but were legally organized separate entities and were therefore each entitled to one spam apiece?

In short... if one spam is permitted, it could make enforcement incredibly cumbersome, difficult and unlikely, and would present spammers with many reasons to violate the law knowing the odds of a suit and successful enforcement are greatly reduced. While bulk spam is really bad on many levels, whether it's parsed out in very small volumes makes little or no difference to the ultimate recipients as far as the diminished utility, cost, and annoyance.

We need a clear, bright line. And the Smith Bill is that.


3.0 Anti-UBE pointers [toc]

3.1 NoCEM, CAUCE and others [toc]

"NoCEM" http://www.cm.org/

"Dougal's NoCeM-E" http://advicom.net/~dougal/antispam/ ... Dougal is sysadm for an ISP. His page has a wealth of information about Anti-SPAM Tools. You will also find his mailing list for NoCeM-E.

"The Coalition Against Unsolicited Commercial Email (CAUCE)" http://www.cauce.org/faq.html ...The Problem: Unsolicited commercial email, more commonly known as "spam", is a growing problem on the Internet. If you've used the Internet for any length of time, you've probably received solicitations via email to purchase products or services.

A Solution: A group of Internet users who are fed up with spam have formed a coalition whose purpose is to amend 47 USC 227, the section of U.S. law that bans "junk faxing", so that it will cover electronic mail as well.

"Teergrubing against Spam" http://www.iks-jena.de/mitarb/lutz/usenet/teergrube.en.html ...`Teergrubing' It's German and means Tar-Pit. Once you have been stuck you can't get out. ...slow down internet connections in order to stop UBE abuse. Several hundred teergrubes are able to block spamming worldwide without blocking any e-mail. How do I start: If you are the admin of a MX host, install a teergrube.

"Lots of good articles about spam" http://www.sun.com/sunworldonline/swol-12-1997/swol-12-spam.html

"Senator Torricelli's el statement on junk e-mail" http://www.sun.com/sunworldonline/swol-08-1997/swol-08-junkemail.html ...considerable variation in the approaches at the federal level, and state legislation varies widely as well. Professor David Sorkin of John Marshall Law School, who summarized and provided links to the major spam-related lawsuits noted above, also provides status summaries and links to state and federal legislation

"Select email court cases -- Lots of them" http://www.jmls.edu/cyber/cases/spam.html America Online, Inc. v. Cyber Promotions, Inc., Compuserve Inc. v. Cyber Promotions, Inc., etc.

3.2 General Filtering pages (more than procmail) [toc]

"Nancy McGough's Mail Filtering FAQ" http://ssil.uoregon.edu/~trenton/autopage/page7547.html http://www.ii.com/internet/faqs/launchers/mail/filtering-faq/

"Information Filtering Resources" http://www.ee.umd.edu/medlab/filter/ Doug Oard oard@glue.umd.edu ...This page lists all known internet-accessible information filtering resources.

3.3 Junk email and spam [toc]

"Spam FAQ" ftp://rtfm.mit.edu/pub/usenet/alt.spam/ http://www.cs.ruu.nl/wais/html/na-dir/net-abuse-faq/spam-faq.html

"The email abuse FAQ" http://members.aol.com/emailfaq/emailfaq.html What is UBE, UCE, EMP, MMF, MLM, Spam, it is all explained here.

"Against Spam -- The garbage collecting." http://www.spam-archive.org/ To support this archive please forward email spam to spam-list@toby.han.de. Everybody is invited to bounce Mail-Spam he/she has got to this list. This is a mailing list to distribute actual spam email. All incoming mail will be checked by subject and from/sender-address whether it has already been distributed or not. No discussions in this list. To discuss this list please subscribe to spam-list-d@hiss.org.

To subscribe to the blacklist-update mailing list:

      TO:   Majordomo@hiss.han.de
      BODY: subscribe blacklist-update you@somewhere.com

Mail postmaster@spam-archive.org to discuss the blacklist if your name is on it (maintained by Axel Zinser fifi@sis.han.de). Get the updated blacklist from ftp://ftp.spam-archive.org/spam/blacklist/

"Doug Muth Page" http://bounce.to/dmuth ... "The SPAM-L FAQ" - A FAQ for SPAM-L, an anti-spam mailing list. This FAQ discusses how to join the list and what to post there, AND it also delves into the technical aspects of spam. For instance, the various kinds of forgeries seen in spams are discussed here, along with information on how to recognise them. If you hate spam, this is something worth checking out... "TheGoodsites List" - I maintain this list, which is part of the Spam Boycott, to show which Internet providers out there act responsibly when dealing with spam. If you're looking for an ISP and want to know where they stand on spam, this is the list for you.

Send an email message to LISTSERV@peach.ease.lsoft.com with the words "subscribe SPAM-L <First name> <Last name>" in the body of the message (no quotes). If you would like to contact the owner, the convention is the same as with all LISTSERV lists. Just send e-mail to spam-l-request@peach.ease.lsoft.com

"Dealing with Junk Email" ...What you should do (and not do) when you have been victimized by a junk emailer. This document teaches you how to read headers in order to trace the origin of junk email, and includes detailed examples to show you how it is done. Headers are designed for computers to read, not people, so they can be a little hard to follow. Therefore, I hereby grant permission to print or electronically save a copy of this page on your local machine for your personal use while tracing junk email. Please check back for updates and corrections, though.

http://www.mcs.com/~jcr/junkemaildeal.html

"How to fight back." http://www.oeonline.com/~edog/spamstop.html

  1. Look at the header of the advertising message. Find the "Message-ID" line. (You might have to tell your e-mail program to display this.)
  2. The words after the @ sign are the sender's real--not faked--Internet Service Provider, or ISP. (Spammers often try to disguise their address, but the Message-ID is a good clue.)
  3. Write a complaint to the postmaster of that ISP, similar to the one below. (If the ISP is junkmail.com, then let postmaster@junkmail.com hear from you.)
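
Steps 1 and 2 above can be sketched in shell like this. It is a hedged illustration only: the function name and the sample message are made up, a Message-ID folded over several lines is not handled, and the Message-ID itself can be forged too, so treat the result as a clue and not as proof.

```shell
# message_id_domain: read a mail message on stdin and print the part after
# the "@" in the first Message-ID header line. Reading stops at the first
# empty line, so a Message-ID quoted in the body is ignored.
message_id_domain() {
    sed -n -e '/^$/q' \
           -e 's/^[Mm]essage-[Ii][Dd]:.*@\([^>[:space:]]*\).*/\1/p' |
        head -n 1
}

# Made-up message with a forged From line.
printf '%s\n' \
    'From: someone@forged.example' \
    'Message-ID: <199803100829.AA123@mail.junkmail.example>' \
    'Subject: FREE 1 yr. Magazine' \
    '' \
    'Body text mentioning Message-ID: <x@body.example>' |
    message_id_domain
```

For step 3 the complaint address would then be postmaster@ followed by the printed domain.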

    "Practical Tools to Boycott Spam" ...We have been actively engaged in fighting spam for years. Recent events, including pending court battles, prompt us to present this page to the public. Fight spam to keep the Internet useful for everyone.

    • Filtering mail to your personal account
    • Blocking spam email for an entire site
    • Blocking Usenet spam for an entire site
    • Blocking IP connectivity from spam sites
    • Other tools and techniques for limiting spam
    • Sample Acceptable Use Policy statements for ISPs

    http://spam.abuse.net/spam/

    "Spam -- stop that!" http://www.accessnt.com.au/faqs/spam.htm http://com.primenet.com/spamking/buyerbeware.html

    "The Campaign to stop junk email web site" http://www.mcs.com/~jcr/junkmail.html ...we will attempt to teach victims and potential victims (that's everyone with an email address) the most effective methods of prevention and retribution.

    "news.admin.net-abuse.* Homepage" http://www.math.uiuc.edu/~tskirvin/home/nana/

    "The automated spamhandler beta information heap." http://www.halcyon.com/natew/

    "Ian Leicht" http://www.nags.org/spamfilter.html

    http://www.junkbusters.com/ http://www.well.com/~jbremson/spam

    "Anti-Spam Provisions in Sendmail 8.8" URL: http://www.sendmail.org/antispam.html

    • Preventing relaying through your SMTP port
    • Refuse mail from selected hosts
    • Restrict mail acceptance from certain users to avoid mailbombing

    "Blocking Email" http://www.nepean.uws.edu.au/users/david/pe/blockmail.html

    • Do you, or your users, receive "junk email" (aka "spam")?
    • Do you have Sendmail R8.8.5 running at your site?
    • Would you like to block known "junk email" senders' addresses?

    Now you can - and there's no need to patch any source code, either. Take advantage of Sendmail's check_mail rule, to see if the sender's address is a member of a nominated "class" - drawn from the contents of the named file. Additional information and links:

    • Prospective Addresses/Domains to Block
    • Limiting Unsolicited Commercial Email
    • EFF "Net Abuse and Spamming" Archive
    • [U.S.] Court Lets AOL Block Email
    • Anti-Spam HOWTO
    • Net Abuse FAQ
    • Figuring out Fake Email & Posts
    • Fight Unwanted Email
    • Unsolicited Junk Email - Bad for Business
    • Fight Unsolicited Email and Mailing
    • Yahoo's Junk Email Resources
    • jmfilter
    • Complaints Addresses at U.S. ISPs
    • news.admin.net-abuse.* Homepage
    • Processing Mail With ProcMail
    • Panix's rc.shared ProcMail Configuration
    • ProcMail Workshop
    • Email Self Defence
    • The SPAM-L mailing list

    "How to chase Usenet spammer." http://super.zippo.com/~sputum/sputools.htm

    3.4 Misc pointers [toc]

    Is there a way to block local users from spamming other sites? Maybe somehow force sendmail to read an rc file that would then grab the From field and see whether the user exists on the system or not. Or run it through some sort of filter.

    [phil] You can and should do this purely in sendmail. I ended up crafting a check_from ruleset that verifies that the envelope sender address is either a) not local; b) a local user; or c) a local alias. At the time I did this mainly to force people to configure their Eudora clients so they didn't say "Return Address: yourname@gac.edu" but it also covers the outgoing bogus source address spam case. For those interested in this kinda thing I've (just) put it up for FTP:

          ftp://ftp.gac.edu/pub/guenther/
    

    "Qmail" http://pobox.com/~djb/qmail.html http://people.iqweb.de/fte/qmail/

    "Sendmail" http://www.sendmail.org/

    "Fetchmail -- old pop3 replacement" ftp://ftp.ccil.org/pub/esr/ http://www.ccil.org/~esr

    3.5 Questionable UBE stop services [toc]

    "IEMMC: Internet E-Mail Marketing Council Formed 1997-03"

    The IEMMC was formed to provide an industry wide trade association for the purpose of promoting responsible e-mail marketing, and to establish an industry standard code of procedures and ethics which will internally regulate and govern the commercial e-mail marketing industry....Under this system, all e-mail of a commercial, unsolicited nature must pass through a universal filtration system which will block the sending of any and all commercial e-mail to the address on the list. Bulk e-mailers will be required to join the organization

    Others have commented that:

    ...IEMMC is a joke. you are probably not doing yourself any favors

    ...Don't take that IEMMC seriously! Many people registered with them and got as many or even more spam as before. After all, Cyberpromo (the operator of IEMMC) knows that the registered addresses will be valid for some time, so they can use and sell this valuable list to other junk mailers.

    "Spammer blacklist" http://www.netchem.com ...remove@netchem.com Dear Sir/Madam, Your email address may be on many spammers' lists. We are compiling a remove list. Forward the original junk to list@netchem.com

    "No Junk E-Mail database" http://pages.ripco.com:8080/~glr/nojunk.html ...We will help stop unwanted email to you..the list is submitted to us, and those addresses that appear in the "do not mail" list are removed and the "cleaned" list is returned

    3.6 UBE related newsgroups [toc]

    alt.kill.spammers alt.hackers.malicious alt.2600

    [by anonymous poster in alt.privacy.anon-server 13 Aug 1997] Proper etiquette demands you contact their ISP. However, if the ISP is not interested in helping you, you should consider a posting in alt.kill.spammers (or even alt.hackers.malicious or alt.2600) - give as many details as you can about the spammer.

    A certain spam-provider targeted the alt.hackers.malicious newsgroup. Not the most sensible thing to do. The ISP's IPs were found, their MX host was hacked. All their DNS entries were published on alt.2600 (so that everyone could add filters to ignore all mail from this company). Oh yeah, their password file also made it to the group! The ISP then posted a complaint to alt.2600, much to the enjoyment of everyone who took part. That host basically died a horrible death. I'm pretty sure that not many people are going to lose any sleep over this! I might as well mention that the ISP's complaint mentioned that their "freedom" was being abused. hehehe. I wouldn't have thought that most of these postings have expired, so I'd recommend a fleeting visit to alt.2600.

    3.7 Software, mapSoN [toc]

    Note: You can do exactly the same as below with procmail with one of the listed procmail modules: pm-jacookie.rc. See the code.

    "mapSoN (NoSpam backwards) -- The no spam utility" http://mapson.gmd.de/ ftp://ftp.gmd.de/gmd/mapson/

    Most spam filtering tools I've seen so far are based on procmail, or a similar tool, and use a list of keywords or addresses to drop unwanted junk mail. While this might be nice to filter mail from known spam domains like "cyberpromo.com", it won't catch faked headers.

    mapSoN must be installed as filter program for your incoming mail, usually by adding an appropriate entry to your $HOME/.forward file. This means that mapSoN will get all your incoming mail and it will decide whether or not to actually deliver it to your mailbox.

    1. First of all, a user-defined ruleset is checked against the mail. If any keywords or patterns match, the mail will be dealt with according to your wishes. This is useful to drop some sender's mail completely, or to sort mail into different mail folders.
    2. If no rule matches the mail, mapSoN will check whether the mail is a reply to an e-mail you sent, or whether it is a reply to a USENET posting of yours. If it is, the mail will always be delivered.
    3. If no signs of a reply-mail can be found, mapSoN will check whether the sender stated in the From: header has sent you mail before. If he has, the mail will pass. If this is the first time you receive an e-mail from this address, though, mapSoN will delay the delivery of the mail and spool it in your home directory. Then it will send a short notice to the address the mail comes from, which may look like this:
            From: Peter Simons simons@petidomo.com
            To: never_mailed@me.before
            Subject: [mapSoN] Request for Confirmation
      
            mapSoN-Confirm-Cookie: <some_weird_cryptographic_cookie>
      

      The person who tried to contact you will then reply to this "request for confirmation", citing the cookie stated in the mail. When your mapSoN receives this confirmation mail, it will deliver the spooled mail into your folder. Furthermore, the address will be added to the database, so that mail from this person will pass directly in future.

      If no confirmation mail arrives within a certain time, mapSoN can either delete the spooled mails, or send them to a special folder, or whatever you prefer.
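
Step 3 above boils down to a lookup in a list of known senders. The sketch below shows only that lookup; it is not mapSoN's code, and the function name, the one-address-per-line file format and the sample addresses are assumptions made for the example. The ruleset check, the reply detection and the cookie handling are left out.

```shell
# Decide what to do with a message based on a known-senders file.
# $1 = address from the From: header
# $2 = hypothetical database file, one address per line
sender_action() {
    if grep -q -i -x -F -e "$1" "$2" 2>/dev/null; then
        echo deliver     # sender has mailed before: pass the mail through
    else
        echo hold        # first contact: spool it and ask for confirmation
    fi
}

db=${TMPDIR:-/tmp}/known-senders.$$
printf '%s\n' friend@example.invalid > "$db"

sender_action friend@example.invalid "$db"
sender_action never_mailed@me.before "$db"
```

Once a confirmation arrives, the address would be appended to the same file so that later mail from it passes directly.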

      3.8 Software: spamgard [toc]

      [similar to mapSoN] ftp://ftp.netcom.com/pub/wj/wje/release/sg-howto

      ...spamgard(tm) screens from your e-mail unsolicited bulk mail. It does this in a way that you only have to change things if you have a new person from whom you do want to receive mail; you don't have to change things every time a spamster thinks of a new trick to pull, or a new spamster comes along. And spamgard(tm) is designed so that those who aren't in your "Good Guys" list can get mail to you anyway until you put them there. The instructions for them to get mail to you are simple and newbie-tested, but will still keep out bulk mail. If you're on a mailing list you want to be on, there are provisions for accepting all mail from a set of mailing lists that you specify.

      3.9 Software: RBL lookup tool [toc]

      4 Dec 1997, Edward S. Marshall emarshal@logic.net in procmail mailing list.

      rblcheck is a lightweight C program for doing checks against Paul Vixie's Blackhole List. It works well in conjunction with Procmail for filtering unwanted bulk email (under QMail, for example, you can invoke it with the value of the environment variable TCPREMOTEIP). rblcheck is extremely simple:

            % rblcheck 1.2.3.4
      

      where 1.2.3.4 is the IP address you want to check.

      This is a quick note to announce the availability of a new tool for using Paul Vixie's RBL blacklist (see http://maps.vix.com/rbl/ for more information about the blacklist itself, if you don't already know). Most tools which use the blacklist block email on a site-wide basis. For many networks, this treads on both the ideals of the administration, and on the perceived freedoms of the end user.

      Personally, I don't care either way. :-)

      This tool was to fill the need I had to reject mail personally, since one of the systems I receive mail through cannot, for various political reasons, implement the available RBL filters on a site-wide basis.

      "rblcheck" is a simple tool meant to be used from procmail and other personal filtering systems under UNIX in the absense of a site-wide filter, as an alternative to imposing site-wide restrictions, or as a means of imposing restrictions on systems that cannot support the existing RBL filter patches.

      Simply put: you hand it an IP address, and it determines if the IP is in the RBL filter, providing the caller with a positive or negative response. With the package, a sample procmail recipe is provided, and examples of using it under QMail and Sendmail are given.
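
The announcement does not show how the lookup itself is done. DNS-based blacklists of this kind are normally queried by reversing the octets of the IP address and looking the result up under the list's DNS zone. The sketch below only builds that query name; the function name and the zone rbl.example.invalid are placeholders invented for the example, not the real RBL zone.

```shell
# Build the DNS name that a DNS blacklist lookup would query.
# $1 = dotted IPv4 address, $2 = blacklist zone (placeholder below)
rbl_query_name() {
    # a.b.c.d becomes d.c.b.a.<zone>
    printf '%s\n' "$1" |
        awk -F. -v zone="$2" '{ print $4 "." $3 "." $2 "." $1 "." zone }'
}

rbl_query_name 1.2.3.4 rbl.example.invalid
```

For 1.2.3.4 this prints 4.3.2.1.rbl.example.invalid. A resolver that returns an address record for such a name means "listed"; no record means "not listed".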

      http://maps.vix.com/rbl/
      http://www.isc.org/bind.html The official home page
      http://www.xnet.com/~emarshal/rblcheck/

      It is only tested under Linux 2.x and Solaris 2.5.1. Success stories, patches, questions, suggestions, and flames can be directed to me at "emarshal@logic.net".

      [Aaron Schrab aaron+procmail@schrab.com] Here is my rbl setup, but, this depends both upon the format of the Received: lines, and the way that mail passes through your mail system.

      I currently grab the IP address from the first Received: header inserted by my ISP (I'm a sysadmin at the ISP, so I have a good knowledge of how mail gets passed around internally). Here's the recipe that I use.

            # if there's a Received: header from one of these servers, it's
            # (probably) the right one
      
            BACKUPSERVER    = "([yz]\.mx\.execpc\.com)"
            VIRTSERVER      = "(vm[0-9]+\.mx\.execpc\.com)"
            LOCALSERVER     = "([abc]\.mx\.execpc\.com)"
      
            # Match a header containing:
            #   Received: <anything> [<ip address>]) by <local server>
            :0
            * $ 9876543210^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${BACKUPSERVER}
            * $ 9876543210^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${VIRTSERVER}
            * $ 9876543210^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${LOCALSERVER}
            {
              IP = $MATCH
              # trim it down to just the IP address
              #
              :0
              * IP ?? ^^\/[0-9.]+
              {
                IP = $MATCH
      
                :0 W
                * ! ? /home/aarons/bin/rblcheck -q $IP
                { SPAM = "$SPAM $IP is rbl'd$NL" }
              }
            }
      

      3.10 Software, Spam Be Gone [toc]

      "Spam Be Gone" http://www.internz.com/SpamBeGone/ ...uses machine learning and artificial intelligence technologies to examine incoming mail messages and determine their priority... is more than just a Spam filter, it's a general purpose mail message prioritiser. You train the system, telling it which are good, and which are bad messages. As Spam Be Gone! learns it becomes customised for each individual user.

      W. Wesley Groleau wwgrol@sparc01.fw.hac.com in procmail mailing list comments:

      > They only distribute binaries, and I'm paranoid. Anyone able to
      > convince me it's not really a Trojan Horse to collect addresses of
      > spam-haters or something even worse?

      I did some sleuthing. I am 95% convinced that SpamBeGone is not a front or cover for any spammer(s). To protect the author's privacy, I won't say why I'm convinced or how I got the info. Sorry. If you're paranoid like me, you'll have to do your own sleuthing before you use it.

      I'm also convinced SpamBeGone's theory is sound. I won't judge the implementation until I've used it for a while.

      R Lindberg & E Winnie rlindber@kendaco.telebyte.com in procmail mailing list comments:

      I have to agree with the recent comments about Spam Be Gone, I found it tends to be inaccurate. I first set it up about a week ago, followed the directions and trained it on several (15 to 20) messages. One from each list we get, and the remainder from my logs of SPAM messages.

      The first day it missed about half the SPAM, and nailed about 1/3 of the real messages. So I tuned the key-words a bit, trained it on about 100 more SPAMs and trained it on all the good messages it nailed. Since then it has nailed every SPAM received, however the second day it nailed about 20% of the good messages, which I then trained it to like. Since then it has been nailing about 10% of the good messages, despite continual training. I also added every list to the address book, and it still nails posts from this list, and my wife's lace list.

      I even went through my entire log of SPAM and trained it on every one that didn't come out a 5 (bad). Being the kind of person I am, I also checked after I trained it, and found four SPAMs that, despite my training it that they were bad (5), came out as not so bad (4). I don't dare kill 4's as far too much of my mail (like this list) ends up as 4's.

      For me, this program is not ready for prime time. If the comments are correct that it only learns on Subject and From headers, it's not even worth trying, since lists use the To and Cc headers to be identified, and there are several other excellent headers (X-Advertisement comes to mind) that would be assets for killing SPAM.


4.0 Procmail pointers [toc]

4.1 Procmail discussion list [toc]

Traffic in this list is about 5-20 messages per day. Do not join if you can't handle that much traffic. The list is run by SmartList, procmail based list software.

submitting questions/answers procmail@informatik.rwth-aachen.de
subscription requests procmail-request@informatik.rwth-aachen.de
digest request procmail-d-request@informatik.rwth-aachen.de

[Ralph Sims ralphs@halcyon.com] Be sure to use SUBJECT field, and send subject "subscribe" to procmail-request .

4.2 To get off procmail discussion list [toc]

To get off the list: send a message to procmail-request with:
      unsubscribe user@domain         in the subject line
      unsubscribe                     first line in the body

If that fails, try email to procmail-owner@informatik.rwth-aachen.de (purportedly that should go to a person).

See also the original subscription message that you will have received: http://www.iki.fi/~era/procmail/welcome.txt

4.3 Procmail Lint service (code check) [toc]

If you have and can use Emacs, please download the Procmail programming mode, tinypm.el, that [jari] has written. Lint is included in there and it can auto-correct mistakes on the fly. You can get it from the mentioned uta ftp site (get tgz kit).

Because not all people know how to use Emacs or Emacs lisp packages, or are otherwise clueless about Unix tools, I put up a procmail Lint service where you can send your code.

      To: jari.aalto@poboxes.com
      Subject: send pm-lint.hlp

This service is highly experimental and if traffic starts to get too high, I have to close it, because every message to the Lint starts a background Emacs process that consumes server resources. The preferred way is that you get your own Emacs package and Lint your code locally. When you send a message to the Lint, it will respond with a message similar to this:

      *** 1997-11-24 22:13 (pm.lint) 3.11pre7 tinypm.el 1.80
      cd /users/jaalto/junk/
      pm.lint:010: Warning, no right hand variable found. ([$`']
      pm.lint:055: Pedantic, flag orer style is not standard hW:
      pm.lint:060: Warning, message dropped to folder, you need lock.
      pm.lint:062: Warning, recipe with "|" may need w flag.
      pm.lint:073: Warning, Formail used but no f flag found.

4.4 Procmail module list [toc]

The UBE stop modules are not listed here. See pointers in "procmail code" section later. The pm-ja*.rc modules are in Jari's file server; all other modules can be found from Alan's file server.

subroutine = A piece of code that gets something in INPUT and responds with OUTPUT. Subroutine is not message specific.

recipe = A piece of code, that is somewhat self containing: It reads something from the message or does something according to matches in message. Recipe may be message specific.

[Header files] These are like #include .h files in C, they define some common variables, but do not contain actual code.

[General]

[Date and time handling] For these, you extract the date from somewhere first and then feed the string to some of these subroutines:

[Date and time handling] You use these recipes to get date directly from the message:

[Forwarding and account modules]

[Vacation modules]

[Message-id based modules]

[Cron modules]

[Backup modules]

[Confirmation modules]

[File Servers]

The same variables have been used for these two servers: you can switch to one or the other by just changing the includerc. Above servers are Plug-in modules: you create a directory, put files there and your server is immediately

[Mime]

[Filtering messages body/headers]

[Misc]

[Mailing lists]

4.5 Plus addressing foo+bar@address.com [toc]

http://www.qz.to/~eli/faqs/addressing.html http://www.faqs.org/faqs/mail/addressing/

[Roy S. Rapoport rsr@macromedia.com] This sort of addressing is implemented using sendmail (well, I'm sure the other MTAs can also do it, but my experience is with sendmail). The last few releases of sendmail (8.8.6, 8.8.7, 8.8.8) all seem to automatically default to allowing it.

Basically, for any address of the form 'foo+baz', sendmail ignores the '+baz' part and just delivers it to foo.
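
To make the split concrete, here is a small shell sketch that separates the user from the "+detail" part of an address. The function names are invented for the example and this is not how sendmail implements it; it only shows which part is used for delivery and which part is left over for filtering.

```shell
# Print the delivery user: everything before the first "+" in the local part.
plus_user() {
    local_part=${1%%@*}
    echo "${local_part%%+*}"
}

# Print the detail after the first "+", or an empty line if there is none.
plus_detail() {
    local_part=${1%%@*}
    case $local_part in
        *+*) echo "${local_part#*+}" ;;
        *)   echo "" ;;
    esac
}

plus_user   foo+baz@address.com
plus_detail foo+baz@address.com
```

The two calls print foo and baz. The detail is what a recipe can later match on, for example to sort mail by the tag a sender was given.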

4.6 General procmail pointers [toc]

Procmail is discussed in usenet newsgroup comp.mail.misc .
Binary in ftp://ftp.informatik.rwth-aachen.de:/pub/packages/procmail/

"Archived messages" ftp://ftp.informatik.rwth-aachen.de:/pub/packages/procmail/ Articles from procmail mailing list: covers from 1994-08 to 1995-05 (A .gz file: ~2Meg when uncompressed)

The latest articles, covering 1995-10 to the present day, are hosted by Achim Bohnet ach@mpe.mpg.de. The www page has nice search capabilities. http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/ http://www.rosat.mpe-garching.mpg.de/~ach/exmh/archive/procmail/

"Era's Procmail faq" http://www.iki.fi/~era/procmail/mini-faq.html http://www.dcs.ed.ac.uk/~procmail/faq/ [mirror] Also available by email, the ITEM can be: links.html, mini-faq.html, procmail-faq

      To: era+pr@iki.fi
      Subject: send ITEM

"Era's Procmail Link collections" http://www.iki.fi/~era/procmail/links.html ...A page with full of good links to the world of procmail

"Catherine's Getting Started With Procmail" http://shell3.ba.best.com/~ariel/nospam/proctut.shtml This is a quick tutorial intended to get a procmail neophyte started using procmail with as little trouble and fuss as possible.

4.7 Procmail Code [toc]

"Alan's procmail library" Send subject "send procmail library" to Alan Stebbens aks@sgi.com http://reality.sgi.com/aks

"pm-code.shar, Jari's Procmail modules" Send subject "send help" or "send ls-l.txt" or "send pm-code.shar" to jari.aalto@poboxes.com. The first one sends the server help file and second sends list of available files. In this server directory you find procmail recipes and Emacs information.

"Elijah's" http://www.qz.to/~eli/src/procmail/rc.master.html

"Concordia scripts" http://alcor.concordia.ca/topics/email/auto/procmail/ ...We provide sample sets of recipes to get you started. The great thing about the concordia scripts is the fact that they are designed to run from a central location and be called from a procmailrc installed in the user's ~/home directory.
webdoc@alcor.concordia.ca

"Meng on procmail" http://icg.resnet.upenn.edu/procmail/ http://res2.resnet.upenn.edu/procmail/ ...goes into exhaustive detail about how I manage my mailing lists

"David's" David Hunt dh@west.net http://www.west.net/~dh/homedir/pmdir/ ...My .procmailrc and .forward files can be viewed at

4.8 SmartList code (mailing list implementation with procmail) [toc]

"Michelle's (SmartList add-ons)" ftp://ftp.fatfree.com/ --> confirm-1.1.tar.gz To add subscription confirmation to smartlist

"The mail-list.com front-end for Smartlist Mailing Lists" ftp://ftp.mail-list.com mark david mcCreary <mdm@internet-tools.com ...This is a front end to Smartlist mailing lists. It provides easy to use addresses for subscribers of the list. It also allows list owners to concatenate many Smartlist commands into one email message, which is then broken up and sent in as separate messages to Smartlist.

These scripts can accept messages for any mailing list, extract the name of the list, and then format the appropriate commands that are then forwarded on to the standard Smartlist list-request address. These scripts do not directly alter the Smartlist dist file.

A discussion group mailing list is available for support. Please send a blank message to front-end-on@mail-list.com to start the subscription process.

4.9 Procmail code to filter UBE [toc]

Sysadmins, remember: spam filtering is done much more efficiently in the MTA, especially if you are just looking at the From and To lines. For example, in Exim you can set up a rule that blocks ^\d.*@aol\.com (that is, any aol.com local part that begins with a digit). AOL guarantees that none of their addresses begin with a digit. Exim rejects such bogus addresses at the SMTP level before the message is received.
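
To see which addresses such a pattern would catch, the sketch below runs an equivalent expression through grep. This is only a way to try the pattern out on sample data, not Exim configuration; grep -E has no \d, so [0-9] is used, and the function name and test addresses are made up.

```shell
# Print only addresses whose aol.com local part begins with a digit.
bogus_aol() {
    grep -E -i '^[0-9][^@]*@aol\.com$'
}

printf '%s\n' 12345@aol.com steve@aol.com 99lives@example.invalid | bogus_aol
```

Only 12345@aol.com is printed. Note the leading ^ anchor: without it, any address with a digit somewhere before the @ would match as well.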

"pm-dsspam.rc" 1997-09-13 Daniel Smith DanS@bristol.com has one exellent spam filter called spamc.rc. It uses some nice heuristics and filters from various people, icluding [david] and [phil]. You can get this first version by sending following message.

      To: jari.aalto@poboxes.com
      Subject: send pm-dsspam.rc

Later Dan made substantial changes to it and the new version is available from ftp://ftp.bristol.nl/pub/users/DanS/spamcheck

"pm-jaube.rc Jari's ube filter (compiled from others)" After Daniel Smith posted his spam recipes to procmail mailing list, Jari investigated them and compiled other recipes to a general purpose UBE module that needs no special setup and can be installed via simple INCLUDERC. No additional ube-list files are used, all UBE all detected happens using procmail rules. The module is included in pm-code.shar.

      To: jari.aalto@poboxes.com
      Subject: send pm-code.shar

"Catherine A. Hampton's Spambouncer" http://www.best.com/~ariel/nospam/ ...The attached set of procmail recipes/filters, which I call The Spam Bouncer, are for users who are sick of spam (unsolicited junk mail email) and want to filter it out of their mail as easily as possible. These recipes can be used as shared recipes for a whole system, or by an individual for their own mailbox only.

"Protect yourself from spam: A practical guide to procmail" http://www.sun.com/sunworldonline/swol-12-1997/swol-12-spam.html ..take you, step by step, through everything you need to know in order to enlist the aid of a Unix host in filtering unwanted e-mail traffic. This page is exellent to get you started with procmail and filtering with simple recipes and how to store messages to folders. Recommended for newcomers to Procmail.

"junkfilter" by Gregory Sutter and Matthew Hunt. http://www.pobox.com/~gsutter/junkmail/ gsutter@pobox.com Junkfilter is a user-configurable procmail-based filter system for electronic mail. Recipes include checks for forged headers, key words, common spam domains, relay servers and many others.

"Download procmail spam filters" http://www.telebyte.com/stopspamr This is exellent site and contains many other spam stop pointers.

"Andy's" andy@neptune.chem.uga.edu [Andy Dustman is a respected remailer operator] ...For the anti-spam procmail recipe, send me mail with subject "spam"


5.0 Dry run testing [toc]

5.1 What is dry run testing [toc]

It means that you call your procmail test script directly with a sample test mail:
      % procmail $HOME/pm/pm-test.rc < $HOME/tmp/test-mail.txt

The script pm-test.rc has the procmail recipe you're testing or improving. The test-mail.txt is any valid email message containing the headers and body. You can make one with any text editor: see if there is vi, pico or emacs in your Unix system. Below you see a simple test mail skeleton

      From: me@here.com
      To: me@here.com (self test)
      X-info: I'm just testing

      BODY OF MESSAGE SEPARATED BY EMPTY LINE
      txt txt txt txt txt txt txt txt txt txt

And remember that you can define environment variables as well in the dry run call. Here is an example where procmail just executes the script and does nothing fancy.

      % procmail VERBOSE=on DEFAULT=/dev/null ~/pm/pm-test.rc < /dev/null
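
The whole dry run can be put into a small shell script. The sketch below writes a test message (note the empty line between headers and body) and feeds it to the recipe file only if procmail and the rc file are actually present. The temporary file location is an arbitrary choice for this example; the rc file path is the one used above.

```shell
# Write a minimal test message to the file named in $1.
# The empty line separates the headers from the body.
make_test_mail() {
    cat > "$1" <<'EOF'
From: me@here.com
To: me@here.com (self test)
X-info: I'm just testing

txt txt txt txt txt txt txt txt txt txt
EOF
}

mail=${TMPDIR:-/tmp}/test-mail.$$
make_test_mail "$mail"

# Run the recipe under test only when both procmail and the rc file exist.
rc=$HOME/pm/pm-test.rc
if command -v procmail >/dev/null 2>&1 && [ -f "$rc" ]; then
    procmail VERBOSE=on DEFAULT=/dev/null "$rc" < "$mail"
fi
```

With DEFAULT=/dev/null, anything the recipes do not deliver elsewhere is thrown away, so repeated test runs do not fill your real mailbox.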

5.2 Why the From field is not okay after dry run [toc]

It now says "From foo@bar Mon Sep 8 14:38:06 1997"

[phil] Don't worry about this. It's a side-effect of running the message through formail after having generated any auto-reply -- the auto-reply generated by "formail -rt" doesn't have a "From " header (it's pointless for outgoing messages), so the second formail adds one, not knowing that it'll just be ignored by sendmail later (well, sendmail will extract the date from it, but that's ignorable). You only see it because you're saving to a folder instead of mailing it.

5.3 Getting default value of procmail variable [toc]

[david] There's always this way to learn a variable's initial value (note the strong quotes), which Stephen uses to get procmail's value for $SENDMAIL in the scripts that build SmartList:
      procmail LOG='$PATH' DEFAULT=/dev/null /dev/null < /dev/null

Since LOGFILE hasn't been defined, $PATH will be printed to the screen. One caution: if there are any variables in the definition of $PATH (such as $HOME), they'll be expanded in the output.


6.0 Things to remember [toc]

6.1 Get newest procmail [toc]

A lot of trouble surfaces only because you have an old procmail version. Be sure to have the latest, which was 3.11pre7 (1997-09-15). Knock your sysadm or ISP until he installs this version; don't give up if you're serious about using procmail. Here is the command to check your procmail version:
      % procmail -v

6.2 Csh's tilde is not supported [toc]

A real csh or Emacs freak is accustomed to using tilde (~) everywhere, but drop that habit now. Procmail doesn't support it; just use $HOME. And when you write procmail recipes, think sh, not csh. The change of mind will automatically tune your brain to the right programming habits.

6.3 Be sure to write the recipe start right [toc]

The recipe starts with :0 or just with : but the latter one is somewhat dangerous and easy to miss. Beware writing it 0: as it happens easily. The Procmail code checker, Lint, also requires that you use the :0 recipe start convention.

[phil] Always put a zero after the colon that begins the recipe. In the first versions of procmail, you would put the number of conditions, with a default of 1. That was annoying, and the computer can do the counting easier, so Stephen made it so that a count of 0 indicates that the conditions are all the lines beginning with a '*'. The default is one, unless one of the 'a', 'A', 'e', or 'E' flags is given, in which case the default is zero. ALWAYS START RECIPE WITH :0.
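
A crude way to catch the mistyped forms mentioned above is to grep the rc file for lines that begin with 0: or with a colon that is not followed by a zero. This is only a sketch and no substitute for the Lint; the function name and the sample rc file are invented for the example.

```shell
# Print (with line numbers) rc-file lines that look like a bad recipe start:
# "0:" instead of ":0", or ":" with no zero right after it.
check_recipe_start() {
    grep -n -E '^[[:space:]]*(0:|:([^0]|$))' "$1"
}

# Sample rc file: the first recipe is fine, the second starts with "0:".
rc=${TMPDIR:-/tmp}/sample-rc.$$
printf '%s\n' ':0' '* ^Subject: test' '/dev/null' '0:' '* ^From: x' 'junk' > "$rc"

check_recipe_start "$rc"
```

For the sample file the only line reported is line 4, the one that reads 0:.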

6.4 Always set SHELL and PATH variables [toc]

[faq] If your login shell is a C shell (csh or tcsh), prepare for havoc. As a precaution, always put the following at the top of your .procmailrc.
          SHELL = /bin/sh

[jari] And it is very likely that the default PATH environment variable that your .procmailrc sees is not enough. To play safe, so that all the needed binaries can be found when escaping to the shell in procmailrc, set the PATH variable as the very first statement. Here is one example that should work in HP-9, HP-10 and SUN-OS. You can add paths that don't exist; that way you can port the .procmailrc to another server (from HP to SUN, as I do).

      #   Prepend these paths to the PATH
      #
      PATH        = $HOME/bin:\
      /usr/contrib/bin:\
      /bin:/usr/bin:/usr/lib:/usr/ucb:/usr/sbin:\
      /usr/local/bin:/opt/local/bin:\
      /vol/bin:/vol/lib:/vol/local/bin:${PATH}

6.5 Keep the log on all the time [toc]

It's best that you put these variables at the very start of your procmailrc. When you start using procmail, you also want to know all the time what's happening there and why your recipes didn't work as expected. The answer to almost all your questions can be found in the log file.
      LOGFILE     = $PMSRC/pm.log
      LOGABSTRACT = "all"
      VERBOSE     = "on"

6.6 Never add trailing slash for directories [toc]

[phil] Drop the trailing slash: it'll choke if you ever end up on Apollo's DomainOS where double slashes are network references. If a directory has a trailing slash, it apparently may also choke mkdir() on some OSes (they treat it like "/.").
      DIR         = "/full/path/to/www/directory/"    # Wait...
      FILE        = $DIR/file                         # Ouch !

6.7 Remember what term "delivered" means [toc]

[alan] When procmail delivers a piece of mail, whether to a file or a pipe-command, if the write succeeds, then the mail is considered to have been delivered, and processing stops with that recipe file.

Here is the relevant text from man page:

...There are two kinds of recipes: delivering and non-delivering recipes. If a delivering recipe is found to match, procmail considers the mail (you guessed it) delivered and will cease processing the rcfile after having successfully executed the action line of the recipe. If a non-delivering recipe is found to match, processing of the rcfile will continue after the action line of this recipe has been executed.

6.8 Beware putting comment in wrong place [toc]

You like commenting a lot, sticking comments everywhere possible? Yes, I do that too, and got into trouble because you're not that free to comment code in procmail. Pay attention to the following example:
      :0          # comment, nice tune...
      * Ditch-it  # AUTCH, Autch, autch. This comment must not be here!!
          #         Hm, Old procmail versions don't understand this
          #         Are you sure you want to put comment inside
          #         Condition line?
      * ditto
      {   # recipe block start
          # Comment on line by itself
          :0
          /dev/null
      }

So, the place to watch is the condition line. Some later procmail version promised to correct this misfeature, but it never came true. No procmail version exists yet that allows putting comments after a condition clause.
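
A sketch of the same recipe with the comments moved to safe places, that is, above the recipe or on their own lines inside the block. The conditions are the placeholders from the example above.

      #   Ditch these messages (the comment sits above the recipe)
      :0
      * Ditch-it
      * ditto
      {
          #   Comment on a line by itself
          :0
          /dev/null
      }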

6.9 Brace placement [toc]

Be careful with your braces and remember that old procmail versions aren't as forgiving as the newest version. Below you see the classical "test the OK condition first, and if that fails then do something else" construct. See the side comments.
      :0
      * condition
                          # No space allowed here!
      {}                  # Wrong, Must include _one_ empty space
      :0 E
      {do_something }     # Again mistake, must have surrounding spaces
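
For comparison, a sketch of the corrected layout. As in the example above, do_something is only a placeholder for a real assignment or recipe.

      :0
      * condition
      { }                 # one space between the braces

      :0 E
      { do_something }    # spaces after "{" and before "}"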

6.10 Local lockfile usage [toc]

Lockfiles are only needed when procmail is doing something that should be serialized, i.e., when only one process at a time should be doing it.

This generally means that any time you write to a file, you should have a locallock, preferably based on the name of the file being written to. Forwarding actions ('!'), and 99% of all filters don't need lockfiles. However, if a filter action writes to a file while filtering, then you may need a lock. To demonstrate this rule, consider the recipe above: it doesn't need a lockfile because no file was written to, and multiple formails can run at once.

Beware misplacing the lock colon(:)

       :0: a      # Ouch! Wrong: "a" is taken as the lockfile name
       :0 a:      # Okay.

Note that in delivering recipes where you write the content yourself, you must name a local lockfile explicitly when using the > token, because procmail can't determine the lock by itself. It can only determine the lockfile from the >> token. [stephen] Indeed. However, putting a lockfile on a recipe like this is, of course, utterly useless, so you might as well omit the locking entirely.

      #   Save last body of message to this file
      :0 b:  mail.body.lock
      | cat > mail.body

[phil] Watch this too. A nesting block that does not launch a clone cannot take a local lockfile on the recipe that starts the braces. A nesting block that does launch a clone can. (see the error)

      :0: file.lock
      {
          #  error: "procmail: Extraneous locallockfile ignored"
          #  - This lock file will be ignored
          #  - If the recipes inside the braces try to use file.lock
          #    as  a lockfile, then you'll have a deadlock situation.
          :0:
          /tmp/tmp.mbx
      }

Let me also explain why the w is so important. Notice that the two recipes here are equivalent: the W is implicit. NOTE: this is only true on the recipe that opens a nested block. On a recipe with a program, forward, or delivery action, W is different from w, which is different from omitting both.

      :0 c: somefile      :0 Wc: somefile
      { ... }             { ... }

To quote the comment in the source code, "try and protect the user from his blissful ignorance". The parent will always wait for the cloned child to exit when a lockfile is involved. The only question is whether or not it should be logged. If you want failure of the cloned child to be logged, then you should use the w flag, like this:

      :0 wc: somefile
      { ... }

A local lockfile can be used to lock a clone; the parent procmail will remove it when the clone exits (thus it serves as a global lockfile for the clone). If the braced block does not launch a clone, asking for a local lockfile generates an error.
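
A sketch of a cloned block that is guarded by a local lockfile, as described above. The condition, the X-Archived header and the mailbox name are assumptions made for the example.

      #   c = run the block in a clone, w = log a failure of the clone
      #   The lockfile is held until the clone exits
      :0 wc: archive.mbox.lock
      * ^Subject:.*archive
      {
          :0 fhw
          | formail -A "X-Archived: yes"

          #   No ":" here, the lock above already serializes this block
          :0
          archive.mbox
      }

The lockfile is named after the mailbox on purpose: it is the same name a plain ":0:" delivery to archive.mbox would pick with the default $LOCKEXT, so other recipes writing to that file cooperate with this one.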

6.11 Global LOCKFILE [toc]

[david] If you want to block everything while the recipe runs, even during the conditions, use a global lock. For example, in this construct the formail call, which updates the message-id cache file, must be protected with a global lockfile.
      LOCKFILE = $MID_CACHE_FILE.lock
      :0
      * ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
      {
              LOG="dupecheck: discarded $MESSAGEID from $FROM
      "
              :0
              $DUPLICATE_MBOX
      }
      LOCKFILE

You cannot use local lockfile as below:

      :0 : $MID_CACHE_FILE.lock
      *   ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE

because the local lockfile named on the flag line will be created only if the conditions have matched and the action is attempted. One more note: there is deliberately no ":" lock when delivering to DUPLICATE_MBOX, because the outer global lockfile already prevents all other procmail instances from executing this part of the recipe.

6.12 Gee, where do I put all those ! * $ ?? [toc]

Ahem. I can't tell you exactly what to do or how to write your own procmail recipes, but I can tell how I'm writing them. Here is my condition line syntax
      *$ [!] BH VAR ?? test

That won't say much unless I give you something to compare with. Here is one perfectly valid rule, but not my style

      :0
      * $ ^Subject:.*$VAR
      *! ^From:.*some
      *B ! ?? match-the-string-in-body
      *  VARIABLE ?? set

I prefer lining up things in the condition lines. The first column is reserved for the dollar sign, the second for the not operator and so on. The important thing is that I can see at a glance whether I have set the variable expansion dollar in the line (leftmost).
      :0
      *$      ^Subject:.*$VAR
      * !     ^From:.*some
      * ! B ?? match-the-string-in-body
      *        VARIABLE ?? set

6.13 Sending automatic reply [toc]

Do not send an automatic reply without checking the "! ^FROM_DAEMON" condition, and always include an X-Loop header and test against it to prevent mail loops:
      :0
      *   conditions-for-auto-reply
      *$ ! ^X-Loop: $MY_XLOOP
      *  ! ^FROM_DAEMON
      | $FORMAIL -A "X-Loop: $MY_XLOOP" ...other-headers...
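
A minimal sketch of a complete auto-reply along these lines, modelled on the auto-reply example in the procmailex man page. The address you@example.com and the reply text are placeholders; $SENDMAIL is procmail's built-in variable for the mail transport program.

      :0 hc
      * !^FROM_DAEMON
      * !^X-Loop: you@example\.com
      | (formail -r -A "X-Loop: you@example.com" ; \
         echo "Your mail has been received.") | $SENDMAIL -t

The c flag keeps the original message available for the recipes that follow, and formail -r builds the reply header from the incoming one.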

6.14 Avoid extra shell layer (check command for SHELLMETAS) [toc]

[dan] It is very important to study your shell command calls and try to save the overhead of the extra layer of shell. It may be extra work once when you write your rcfile, but it saves effort on each piece of arriving email. When procmail sees a character from $SHELLMETAS, it runs
      # Default SHELLMETAS: &|<>~;?*[
      # Default $SHELLFLAGS: -c
      % $SHELL $SHELLFLAGS "command -opts args"

instead of

      % command -opts args

That is because procmail's ability to invoke other programs does not include filename globbing ("[", "*", "?"), backgrounding ("&"), piping ("|"), succession (";"), nor dependent succession ("&&", "||"). If it sees any of those characters (before expanding variables), it hands the job over to a shell.

Sometimes those characters appear in arguments to a command without having their shell metameaning and procmail really could invoke the command directly without the shell. You can see the distinction in a verbose logfile: if procmail runs the command itself, it logs

      Executing "command,-opts,args"

with a comma between each two positional parameters, but if it calls a shell, the original spacing from the rcfile is repeated unchanged in the logfile:

      Executing "command -opts args"

So, if you know you won't be needing shell expansion, wrap your shell calls with this:

      savedMetas  = $SHELLMETAS
      #   Unset SHELLMETAS so that procmail runs the command itself
      SHELLMETAS
      ..command that does not need shell expansion features..
      SHELLMETAS  = $savedMetas
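
A sketch of the idea with a concrete command; the X-Note header is made up for the example. The "[" in the argument is one of the default SHELLMETAS characters, so without the unset procmail would start a shell just for this formail call. With SHELLMETAS unset, the verbose log should show the comma-separated form of the "Executing" line described above.

      savedMetas  = $SHELLMETAS
      SHELLMETAS

      :0 fhw
      | formail -I "X-Note: [checked]"

      SHELLMETAS  = $savedMetas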

6.15 Think what shell commands you use [toc]

For every message procmail launches the processes you have put into your procmailrc. If you haven't paid attention to optimisation before, now it's time to take a magnifying glass and check every recipe and the processes in them. When you write your private shell scripts, the performance hit is not so important, but for mail delivery the matter is totally different. First, let's see some programs and sizes. The following is from HP-UX 10, where the binaries seem to include debug and symbol table code.
      131072 Aug 21  1996 /usr/bin/awk
      196608 Oct  1  1996 /usr/bin/sort
      245760 Jun 10  1996 /usr/bin/grep
      262144 Jun 10  1996 /usr/bin/sed
      303552 Dec  7  1995 /usr/local/bin/gawk
      544768 Jun 10  1996 /usr/contrib/bin/perl
      822232 Aug 25 13:58 /opt/local/bin/perl5.00401
              text    data     bss
      awk:    72727 + 51316 +  15317   = 139360
      sort:  173225 + 18496 + 183076   = 374797
      sed:   237248 + 16992 +  56252   = 310492
      grep:  221591 + 16176 +  53816   = 291583
      perl4: 502220 + 36044 +  65632   = 603896
      perl5: 633812 + 69612 +   2385   = 705809
      gawk:  160018 +  5264 +   7168   = 172450

The binary sizes above are not the typical case; see the sizes on another system [from era]

           4 Sep 28 14:25 /usr/local/bin/awk -> gawk
       32768 Nov 16  1996 /usr/bin/grep
       49152 Nov 16  1996 /usr/bin/sed
      114688 Oct 20  1996 /usr/local/contrib/gnu/bin/grep
      155648 Nov 16  1996 /usr/bin/awk
      155648 Nov 16  1996 /usr/bin/nawk
      221184 Nov 16  1996 /usr/bin/gawk
      311296 Jan 27  1997 /usr/local/bin/gawk
      958464 Nov  2 16:34 /usr/local/contrib/bin/perl
      1196032 Sep 14  1996 /usr/local/bin/perl

Stan Ryckman stanr@sunspot.tiac.net wants you to know that:

Comparing byte sizes on disk means nothing here... these things may or may not have been stripped. Any symbol tables included in the byte counts you see above won't affect process start-up time.

The "size" command will give a better handle on what will be needed in starting a process. The three segments may each have their own overhead, though, and the relative contributions of those segments to startup time may well be system-dependent.

Hm. Can we draw some conclusion? Nothing definitive, but at least something:

Here are some more programs. Don't even think of extracting fields with grep or awk, like "grep Subject", because formail is much smaller and optimized for tasks like that.

      37007 Sep  5 15:53 /usr/local/bin/formail   # 3.11pre7
      28672 Jun 10  1996 /usr/bin/tr
      20480 Jun 10  1996 /usr/bin/tail
      20480 Jun 10  1996 /usr/bin/cat
      20480 Sep 26  1996 /usr/bin/expr
      16384 Jun 10  1996 /usr/bin/head
      16384 Jun 10  1996 /usr/bin/cut
      16384 Jun 10  1996 /usr/bin/date
      16384 Jun 10  1996 /usr/bin/uniq
      16384 Jun 10  1996 /usr/bin/wc
      12288 Jun 10  1996 /usr/bin/echo
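
As a small sketch of that advice, the subject can be picked up with formail alone. The variable name SUBJECT is only an illustration; the h flag feeds just the header to formail.

      #   Instead of piping the message through grep or awk
      :0 h
      SUBJECT=| formail -xSubject: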

6.16 Using absolute path when calling a shell program [toc]

Shell programmers know that if you use an absolute path when you call an executable, the shell doesn't have to search through a long list of directories in $PATH. This may speed up shell scripts remarkably. The correct way to use such an optimisation is to define variables for those programs.

Hm, should you use such an optimisation in your procmail code? That's a two-fold question and I would say yes and no. Consider how many shell calls you have. Do you use grep or formail a lot? Then you could optimise these calls. To be portable, define variables for the executables:

      #  perhaps defined in separate INCLUDERC
      #
      #   INCLUDERC = $PMSRC/pm-mydefaults.rc
      #
      FORMAIL     = /usr/local/bin/formail
      GREP        = /bin/grep
      DATE        = /bin/date

      :0 fh
      | $FORMAIL -rt

And when you port your .procmailrc to different environment that has different paths, you could use this recipe in addition to one just mentioned above:

      #   You could as well use $LOGNAME or $USER to test the account
      #   where your procmail is running.
      #
      MY_PRIMARY_HOST = "helene"
      :0
      *$ ! HOST ?? $MY_PRIMARY_HOST
      {
          # This is not my "primary" account, don't rely on absolute
          # paths here.
          #
          FORMAIL     = formail
          GREP        = grep
          DATE        = date
      }

7.0 Procmail flags [toc]

7.1 The order of the flags [toc]

Order does not matter of course, but here is one suggestion. The idea here is that the most important flags are put to the left, like giving priority 1 to aAeE, which affect the recipe immediately. Priority 2 has been given to flag f, which tells if the recipe filters something. Also (h)eader and (b)ody should immediately follow f; this is considered priority 3. In the middle there are other flags, and the last flag is c, which ends the recipe or allows it to continue. In addition, according to [david]: "...I'm quite sure that putting anything other than the opening colon and the number to the left of A, a, E, or e will cause an error."
      :0[aAeE]f[hbHB][DwWir]c: LOCKFILE
         |    | |     |     |
         |    | |     |     (c)opy flag last.
         |    | |     Other flags
         |    | The header/body flags next
         |    (f)ilter flag to the left, before hb
         The 'process' flags first. Signify (A)ND or (e)rror
         recipe.

You can write the flags side by side

      :0Afhw:MYLOCK

Or, as I prefer, leave the flags in their own slots.

      :0 Afhw: MYLOCK

7.2 Flag w and recipe with | [toc]

[alan] If the filterprogram exits with a 0 status (0 == okay), then procmail will replace the original input body with the output of the filterprogram. If the filterprogram exits with anything but zero, procmail will report an "error" to the log, and "recover" the input (not filter it)

[david] I am very sure that that's the case only if you have the w or W flag on the filtering recipe. Without w or W, procmail won't care about a bad exit status from the filter and will replace the filtered portion with whatever standard output the filter produced. It may still report an error to the log but it won't recover the previous text. This, for example, will destroy the body of a message, even without i:

      :0 fb
      | false

With this, however, procmail will recover the original body:

      :0 fbW      # same results even if we add i
      | false

[stephen] No, not on all occasions. Procmail will not care about the exitcode here. However, if procmail detects a write error, it will recover (because of the missing 'i' flag). Procmail will only detect a write error in such a case if the mail is long enough and does not fit in the pipe buffer that's in the kernel (typically 10KB).

7.3 Flag w, lockfile and recipe with | [toc]

[manual] In order to make sure the lockfile is not removed until the pipe has finished, you have to specify option w; otherwise the lockfile would be removed as soon as the pipe has accepted the mail. So if you see anything that looks like ">" or ">>" in your recipe, that should immediately ring a bell. Check right away that you have included the w flag and the lockfile ":".
      :0 hwc:
      * !^FROM_MAILER
      | uncompress headc.Z; cat >> headc; compress headc

7.4 Flag f and w together [toc]

The w tells Procmail to hang around and wait for the script to finish. [Wouldn't you think this ought to be implied by the f already?]

[david] Of course the f flag is enough to make procmail wait for the filter to finish, but the w means something more: to wait to learn the exit code of the filtering command. If sed fails with a syntax error and gives no output, without W or w procmail would happily accept the null output as the results of the filter and go on reading recipes for the now body-less message. On the other hand, with W or w procmail will respond to a non-zero exit code by recovering the unfiltered text.
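
A minimal sketch of such a filter; the sed substitution is just a stand-in for whatever editing you do on the body.

      #   w: wait for sed's exit code. If sed fails, procmail logs the
      #   error and keeps the unfiltered body.
      :0 fbw
      | sed -e 's/foo/bar/g'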

7.5 Flags h and b [toc]

[david] hb is the default; you need to use h only when you don't want 'b' or vice versa. You can think of it this way: h means "lose the body" and b means "lose the header," but the two together cancel each other out.

[phil] hb (or bh) is the default for actions. You need to specify h without b if you want the action applied only to the head. H is the default for conditions. You need to specify HB or BH if you want to test a condition against the entire message.
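
A short sketch that combines both kinds of flags; the keyword and the mailbox name are placeholders. The condition is tested against header and body, while only the header is written to the file.

      #   HB affects the condition, h affects the action
      :0 HBh:
      * keyword
      headers.mbox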

7.6 Flag i and pipe flag f [toc]

Flag i is useless in mailbox deliveries.

[faq] The following will work some of the time, when the message is short enough, but that's a coincidence. With a longer message, though, Unix starts paying attention to what is happening, because it will have to buffer some of the data, and then when the buffered data is never read, an error occurs. The error is passed back to Procmail, and Procmail tries to be nice and give you back your original message as it was before this malicious program truncated it. Never mind that in this case you wanted to truncate the data. Anyway, the fix is easy: just add an i flag to the recipe (:0fbwi instead of :0fbw) to make Procmail ignore the error.

      :0 fbw
      * condition
      | malicious-pipe

[dan] Here's why the 'i' flag is needed (courtesy of Stephan): You told procmail to filter the entire mail (header and body), so it does and it attempts to write out header and body to the filter. Then procmail notices that not the entire body is being consumed. Procmail, being rather paranoid when it comes to delivery of mail, assumes something went wrong and considers this a failure of the filter.

      :0 fbwi
      | head -$N

7.7 Flag r [toc]

[phil] Procmail automatically turns on the 'r' flag for deliveries to /dev/null, so there's no need to do it yourself.
      :0 r        # leave out the r
      * condition
      /dev/null

[david] You can use the r flag (for raw mode) on every recipe where you do not want a From_ line added. I'm assuming that there isn't one already there; the r flag keeps procmail from making sure that there is a From_ line at the top and a blank line at the bottom, but it will not make procmail remove them if they are already present. Also, be careful to use the -f option on all calls to formail so that formail won't add a From_ line.

Someone who didn't need From_ lines -- I forget who -- found it annoying to put r onto every recipe and altered the source to prevent procmail from adding From_ lines at all, ever. I think a better idea would be a procmailrc Boolean to enable or disable them for all recipes without affecting other users. (Then perhaps we'd need a reverse-r flag to undo raw mode for one recipe at a time?)

7.8 Flag c's background [toc]

Interesting. My view of this flag is more to think of it as CONTINUE with message processing afterwards, even if the conditions matched. [david] Precisely: when you have braces, thinking "continue" instead of "copy" or "clone" can get you into trouble.

Early versions of procmail, before braces and before cloning, called the c flag "continue" in their documentation; I think it is still called that in the source.

When Stephen introduced braces (but not cloning at this point), it was of course implicit that an action line of "{" was non-delivering, and a c was extraneous. People put c's there because they wanted procmail to continue to the recipes inside the braces on a match, and procmail brushed it off with an "extraneous c-flag" warning. No harm done.

When Stephen introduced cloning, though, I was rather upset that he was giving double duty to c instead of introducing something new like C for it, especially because people who absolutely wanted no clone but intended the recipes inside the braces to run in the same invocation of procmail as everything else were mistakenly putting c's on their braces to make sure procmail would "continue". People would (and did) get double deliveries.

Roman Czyborra, though, said that if you consider c to stand for "copy", that covers both uses of c: provide a copy to a simple recipe or, if there are braces, to a clone procmail that will handle the recipes inside the braces. Stephen agreed and changed the documentation accordingly.

Longtime users of procmail and people who read old docs may still think of it as "continue", but since the introduction of clones, that is not a good way to look at it. "Copy" is much safer.

7.9 Flag c before nested block forks a child [toc]

[alan] The combination of a nested block and the c flag causes procmail to fork a child process for the nested block, while the parent skips over it and continues on. The child process doesn't necessarily stop unless a delivering recipe (without the 'c' flag) action succeeds.
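
A sketch of that behaviour; the sender pattern, the forwarding address and the mailbox name are made up. When the condition matches, the block runs in a forked child with its own copy of the message, and the parent carries on below the closing brace.

      :0 c
      * ^From:.*boss
      {
          #   Child: the forward delivers, so the child stops here
          :0
          ! backup@example.com
      }

      #   Parent: continues with the original message
      :0:
      inbox.mbox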

7.10 Flags before nested block [toc]

Given the following recipe, let's examine the flag part
      :0 FLAGS
      {
          do-something
      }

[david] H, B, A, a, E, e, and D affect the conditions and thus are meaningful when the action is to open a brace. H, B, and D would be meaningless, of course, on any unconditional recipe, but they should not cause error messages.

Generally, flags that affect actions are invalid there, and b, h, f, i, and r always are, but the others are partial exceptions: if you are using c to launch a clone, then w, W, and a local lockfile can be meaningful. If there is no c, then w, W, and a local lockfile are invalid at the opening of a braced block.

7.11 Flags aAeE tutorial [toc]

[david] A, a, E, and e are mutually exclusive and no more than one should ever appear on a single recipe. [phil] Actually, this is not true. e does not work with E or a (and procmail gives a warning if you try), and A is redundant if a is given, but at least some of the other combinations make sense and work.

These mnemonics might help:

      # [phil] demonstrates e
      #
      # match, but action fails
      :0
      /etc/hosts/foo
      # no match
      :0 A
      * -1^0
      /dev/null
      # this is skipped because the last tried recipe didn't match
      :0 e
      { whatever }

How they interact with one another when used consecutively has not been fully tested to my knowledge. Consider this:

      :0
      * conditions
      non-delivering action1
      :0a
      action2
      :0e
      action3

Is action3 done if action2 failed or if action1 failed (or perhaps in both situations)? [phil] Action 3 is only done if action2 failed.

If the answer is action2, does this work to get action3 done if action1 failed? I think it does, but does it also run action3 if the conditions didn't match on the first recipe? [phil] Yes, and yes.

      #   [david]
      :0
      * conditions
      non-delivering action1
      :0a
      action2
      :0E
      action3

[phil] If that's not what you want, combine some flags:

      :0
      * conditions
      non-delivering action1
      :0 Ae
      action3
      :0 a
      action2

If the conditions match, action1 will be executed. action3 will then execute if action1 failed, otherwise action2 will be executed [if action1 succeeded].

[david] I know what this structure does because I use it:

      :0
      * conditions
      non-delivering action1
      :0A
      action2
      :0E
      non-delivering action3
      :0A
      action4

If the conditions match, action1 and action2 are performed and action4 is not (of course action3 is not either), even if action2 is non-delivering; if they fail, action3 and action4 are performed. The A on the fourth recipe refers back to the third and no farther. But I don't know about this:

      :0
      * conditions
      non-delivering action1
      :0A
      * more conditions
      action2
      :0E
      non-delivering action3
      :0A
      action4

Now, suppose the conditions on the first recipe match but those on the second recipe do not match. Would the third recipe (and thus the fourth one) be attempted? I would expect so. [phil] Yes. The last tried recipe didn't match, therefore the E flag will be triggered. If that isn't what you want, you can prevent it this way:

      :0
      * conditions
      {
        :0
        non-delivering action1
        :0
        * more conditions
        action2
      }
      :0E # ignores mismatch inside braces, looks only at same level
      non-delivering action3
      :0A
      action4

If that is what you want, you can be positive this way:

      # if action2 is non-delivering or vulnerable to error that
      # would cause fall-through
      DID2
      :0
      * conditions
      non-delivering action1
      :0A
      :0
      * ! DID2 ?? .
      non-delivering action3
      :0A
      action4
      # if action2 is delivering and sure to succeed
      :0
      * conditions
      non-delivering action1
      :0A
      * more conditions
      action2
      :0
      non-delivering action3
      :0A
      action4

[phil] For those who are interested, I'll note that there are only 3 combinations of the a, A, e, and E flags that aren't either illegal or redundant. They are 'Ae', 'aE', and 'AE'. I've shown a use for 'Ae' up above. Here's an example of 'AE':

      :0
      * condition1
      non-delivering action1
      :0 A
      * condition2
      non-delivering action2
      :0 AE
      action3

action3 will only be executed if condition1 matched but condition2 didn't match. Without the A flag, action3 would be executed if either of them failed. This can also be done with a instead of A with analogous results.

Procmail's "flow-control" flags may not be particularly easy to describe in straight terms (and this can all be made more complicated by throwing in a more varied mix of delivering vs non-delivering recipes), but I've found that it usually does what I expect it to do, and when it doesn't or I'm in doubt or I want to be particularly clear, I can always fall back to doing it explicitly via nesting blocks. Pick your poison...


8.0 Matching and regexps [toc]

8.1 Matches are not case sensitive [toc]

Okay, okay; if you read the manual you knew that already. But someone with years of experience with Unix may take it for granted that procmail is case sensitive like the rest of the Unix tools. That isn't so in procmail's case.
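
If you do need a case sensitive match, the D flag turns it on for a single recipe. A minimal sketch, with a made-up subject word and mailbox name:

      #   D: distinguish upper and lower case in the conditions
      :0 D:
      * ^Subject:.*URGENT
      urgent.mbox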

8.2 Procmail uses multiline matches [toc]

Procmail uses multiline matches by default. This means that ^ and $ match a newline, even in the middle of a regexp. Now that you know this, you can easily interpret e.g. $[^>] as: `a newline followed by a line not starting with a >'.

8.3 Headers are folded before matching [toc]

If you have a header that continues on separate lines, you don't have to worry about the linefeeds. Procmail silently joins the header onto one line before matching it:
      Received: from unknown (HELO Desktop01) (208.11.179.72) by
          palm.bythehand.net with SMTP; 4 Dec 1997 23:29:09 -0000
      :0                          # note, match on continuation line
      * ^Received:.*bythehand\.

8.4 Improving Space-Tab syndrome [toc]

[david] Procmail doesn't know the escape code \t, usually known as the tab character, and it doesn't know \n and its friends either.
      #  Not what you think       # Use this, space + tab
      [ \t]                       [   ]

But using space+tab is not very readable and it's a very error-prone construct. I suggest using the following to improve readability:

      WSPC   = "    "         # whitespace = space + tab
      SPC    = "[$WSPC]"      # regexp whitespace, the short name
                              # SPC was chosen because you use this
                              # a lot in condition lines.
      NSPC  = "[^$WSPC]"      # negation of whitespace
      #   match anything except space and tab
      #
      *$ var ?? $NSPC
      #   Don't do this however because the leading () generated by
      #   $\ will trip you up.
      #
      *$ var ?? [^$\WSPC]
      #   match anything except space and tab and newline
      #
      *$ var ?? ($NSPC|$)

But you cannot use a newline inside brackets. The following won't work. Notice that you cannot use the escape code \n. Pay attention to the quotes used.

      WSPCL  = "   "'
      '
      :0
      *$ var ?? [$WSPCL]

Instead use variable syntax:

      WSPCL = "( |       |$)"      # space + tab + dollar

Handling exclamation character

[phil] you do need the first backslash, to keep procmail from considering the exclamation mark as a request to invert the sense of the match. For example, these two conditions are equivalent:

      * ! 200^1 foo
      *   200^1 ! foo

Therefore, a leading '!' must either be backslashed, enclosed in either parens or brackets (I suspect that parens would be more efficient), or prefaced with an empty pair of parens. I would recommend writing the condition with one of these:

      * 200^1 \!!!!
      * 200^1 ()!!!!
      * 200^1 (!!!!)

8.5 Rules for generating a character class [toc]

In a "character class" (things between "[" and "]"), metacharacters don't need to be escaped. Well, a backslash is an exception: e.g. [$[^\\] would match any one of the literal characters dollar, opening bracket, caret, and backslash.

[elijah] If you are inverting a character class, "first" means just after the caret (^). So the character class that contains everything but ] ^ and - must look like this:

      [^]^-]

[david] What if I want a literal $ inside brackets? A $ inside brackets, unless it begins a variable name and the "$" modifier is on, always means a literal dollar sign. It cannot mean a newline if it appears inside brackets. A good way to keep it exempt from "$" interpretation is to put it last inside the brackets (unless one also needs to include a literal hyphen and one can't put the hyphen first; then you'll need to escape the dollar sign with a backslash and put the hyphen last -- well, you could alternatively escape the hyphen, I guess), because procmail knows that "$]" cannot possibly be a reference to a variable.

General guideline:

8.6 Matching space at the end of condition [toc]

[david] If you need to have a tab or space at the end of a condition line you can use these. The third is probably the one procmail can handle the most efficiently
      * rest of string .*
      * rest of string[ ]
      * (rest of string )
      * rest of string ()
      * rest of string( )

[phil] From my looking at the source, the last two should be equal in efficiency, and except for a trace difference in regcomp time, should match at the same speed as a solitary trailing blank. The character class version [ ] will be slower.

Of course, I suspect that neither you nor your sysadmin will ever notice the difference in speed, and given that 99% of all systems are I/O bound and not CPU bound, the system is incredibly unlikely to notice either. I can't complain though, as I also go to various extremes to seek out every last bit of possible performance. Ah well. The first one would be slower yet, though perhaps no slower than the bracket form.

8.7 Beware leading backslash [toc]

I am trying to come up with a procmail recipe that among other things should have the condition 'body does not contain a particular word'. Here is what I tried:
      * ! B ?? \<word\>

[david] You have fallen into the leading backslash problem. If the first character of a regexp is a backslash, procmail takes it as "end of leading whitespace" and strips it. What you coded means "a less-than sign, then the word, then any non-word character." (It also prevents the less-than sign from being taken as a size operator.) Unless the non-word character immediately to the left of the word was a less-than sign, that regexp would fail (and thus the condition would pass). Try this:

      * ! B ?? ()\<word\>

This would work too:

      * ! B ?? \\<word\>

but in a casual reading it would look like "literal backslash, less-than sign, the word, word boundary character," so we on the list generally recommend the empty parentheses.

Do note that the difference in meaning of \< and \> in procmail (where they must match a non-word character) from their meaning in perl and egrep (where they match the zero-width transition into and out of a word respectively) does not come into play here. Because procmail's \< and \> can match newlines (both real and putative), it rarely is a factor. It's a problem only when a single character has to serve both as the ending boundary of one word and also the opening boundary of another. Well, it's also a problem when you have one as the last character to the right of \/, but that's easily solved.

8.8 Correct use of TO Macro [toc]

8.9 Procmail's regexp engine [toc]

[phil] procmail's regexp engine has no special optimization for anchoring against the beginning of the line. Most programs that have such an optimization have it because they need the line distinction for other reasons (for example, grep by default prints the entire line containing a match). Procmail has no such other reason, so it treats newline like any other plain character in the regexp. There should be no speed difference as long as procmail can say: "the first character I see must be a 'foo'". Note that case insensitivity is handled by making everything lowercase, so a letter being first doesn't bring in the spectre of character-classes or anything like that.

> recipe may have just changed the size of the head, procmail
> cannot keep a byte-count pointer nor a line-count pointer to
> where the body begins but must scan through the head to find the
> blank line at the neck before it begins a body search.

Procmail does this when it reads in the head, not when it goes to search the body, so that cost can't be avoided. Let me repeat: searching the body is no slower than searching the header, if we ignore the minimal impact of the sizes of the two.

8.10 Understanding procmail's minimal matching (stingy vs. greedy) [toc]

I want to have a procmail recipe that will save certain mail to folders where the folder name (always a number) is specified in the subject.
      :0
      * ^Subject: *\/[0-9]*
      $HOME/folders/$MATCH

[phil]...and this won't quite work. For a subject with a space after the tab, the '*' on the left hand side will be matched minimally (zero times), and then the stuff on the right hand side will be matched maximally, but starting at the space still, which will match nothing. This is a case where procmail's minimal matching can cause massive confusion and frustration. The solution is usually the following:

FORCE THE RIGHT HAND SIDE TO MATCH AT LEAST ONE CHARACTER

By changing the recipe to:

      :0:
      * ^Subject: *\/[0-9]+
      $HOME/folders/$MATCH

it'll work, because then the left hand side will have to match all the way up to the first digit (but not the digit itself). If you follow the rule in caps then you'll almost always be able to ignore procmail's weirdness in this area.

[david] And examine how procmail matches "Subject: Keywords 9999"

      * ^Subject:.*Keywords.*\/[0-9]*
      procmail: Match on "^Subject:.*Keywords.*\/[0-9]*"
      procmail: Matched ""

The right side was as greedy as it could be; the problem is that we seem to expect greed on the left as well. MATCH is set to null, contrary to our expectation. It is not a bug but rather a frequently misunderstood effect of the way extraction is advertised to operate.

Remember that only the right side is greedy; the left side is stingy, and left-side stinginess takes precedence over right-side greed. Extraction is implemented this way: the entire expression, left and right, is pinned to the shortest possible match; then the division mark is placed and the right side is repinned to the longest possible match starting at the division. The tricky part is to remember that the division is marked during the stingy stage. If the expression is

      ^Subject:.*Keywords.*\/[0-9]*

and the text is

      <newline>Subject:<space>Keywords<space>9999<newline>

then the shortest possible match to the entirety is

      <newline>Subject:<space>Keywords

because ".*" and "[0-9]*" both match to null. Then the division mark is placed on the space after "Keywords" and procmail looks for the longest possible match to [0-9]* starting with that space. That, again, is null, so MATCH is set to null.

We see that it works as expected if regexp is changed to this:

      ^Subject:.*Keywords.*\/[0-9]+

That is a whole other ball of wax. Now the shortest match to the entirety is

      <newline>Subject:<space>Keywords<space>9

and the division mark is placed at the 9. Then procmail refigures the longest match to the right side starting at the division mark and sets MATCH=9999. However here

      ^Subject:.*Keywords\/.*[0-9]*

the second ".*" would have reached not just up to the digits but through them to the end of the line. MATCH would contain the whole rest of the line, all of it matched by ".*", plus a null match for "[0-9]*".

[for the curious reader]

Given the line

      Subject: Keywords 9999

the second regexp below, which differs from the first only by the inserted extraction marker, would not match and would not set $MATCH:

      ^Subject: Keywords *9999        # matches ok
      ^Subject: Keywords *\/9999      # won't !

because the left side would be matched to "<newline>Subject: Keywords" and the immediately following text, " 9999", did not match the right side. It would actually make the condition fail and keep the recipe from executing. It took a lot of circuitous coding to allow for not knowing in advance exactly how many spaces there would be before the digits.

Call it counterintuitive, but it's not a bug. General advice: always make sure that the right side cannot match null or that the last element of the left side cannot match null. Or in other words: force the right-hand side of the \/ to match at least one character.

8.11 Explaining \/ and ()\/ [toc]

[david] \/ with nothing to the left of it means "one foreslash". To start a condition with the extraction operator, use ()\/ or \\/; the latter looks, counterintuitively, like "literal backslash and literal foreslash" (as it would mean if it appeared farther along in the regexp), so most of us prefer the former.

8.12 Explaining ^^ and ^ [toc]

[phil] Procmail doesn't think in lines when it matches; it concatenates all the lines together and then runs the regexp engine. This may be a bit surprising, but consider the following, where we want to discard any message that is likely an HTML advertisement:

      #   Body consists entirely of html code
      #   something which'll match any message which has "<html>"
      #   in the body
      :0B:
      *$ $s*<html>
      html.mbox

The condition test is applied to the entire body. If you want to limit it to match only against the beginning of the body, you have to say so using the ^^ token, as you discovered. A simple line anchor (^ or $) just says that there must be a newline (or the beginning or end of the area being searched) at that particular point in the text being matched. Notice the leading anchors below.

      #   trap spam where the *very* first line of the body started with
      #   <html>
      :0B:
      *$ ^^$s*<html>
      html.mbox

What, exactly, does "Anchor the expression at the very start of the search area..." (i.e. the ^^) mean?

[dan] Technically, an opening ^^ anchors to the putative newline that procmail sees before the first character of the search area (and a closing ^^ anchors to the putative newline that procmail sees after the end of the search area). When the search area is B, that is a point equivalent to the second of the two adjacent newlines that enclose the empty line that marks the end of the head.

The reason I'm bringing that up is this: if there are multiple empty or blank lines between the head and the body, ^^ will mark the start of the second of those lines, not the start of the first line of the body that contains some text.

So if you want to test whether <pattern> is the first printing text in the body, even if it is not necessarily flush left on the very first line, you might need a condition like the following, where SPCL contains space/pipe/tab/pipe/dollar (an alternation of space, tab and newline).

      *$  B ?? ^^$SPCL*<pattern>

MATCH strips all leading blank lines in 3.11pre7

8.13 Procmail and egrep differences [toc]

[By david]
          ^Subject:(.*\<)?humor\> # Wrong would be: ^Subject:.*\<humor\>

8.14 ORing by using De Morgan rules [toc]

[Tim Pickett tbp@cs.monash.edu.au] I thought I'd point out that there are a few ways to do a logical OR of conditions. Someone posted a solution here that involved using procmail's scoring system, but I figured you could do it without scoring by taking advantage of De Morgan's rule:
      a or b = not (not a and not b)

or mathematically:

      a || b <=> !( !a && !b )

Or in procmail syntax, if the "and" condition version is

      :0
      * condition1
      * condition2
      action
      rest_of_file

[david] The flaw is that rest_of_file may have more instances of ORing, so the nesting of braces will get very hairy to maintain (and to get right in the first place). The lesser flaw is that, without a folding editor, you're in trouble for maintenance, because the ORed conditions and the action on them are separated by great amounts of other text in your rcfile.

Here's a way to do ORing

      :0
      * ! condition1
      * ! condition2
      { }
      :0E
      action_on_condition1_or_condition2

and then you don't have to put the rest of the rcfile inside braces nor save the OR action for long, long afterward (where you'd need a folding editor to see the action and the ORed conditions together).
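
The identity that both variants rely on can be checked exhaustively. The following is a small sketch in plain sh, independent of procmail; the function names plain_or and de_morgan_or are invented for this illustration only.

```shell
#!/bin/sh
# a OR b, written directly. Arguments are 1 (true) or 0 (false).
plain_or() {
    [ "$1" = 1 ] || [ "$2" = 1 ]
}

# a OR b, written as NOT (NOT a AND NOT b), which is the shape of the
# "* ! condition1 / * ! condition2 / { } / :0E" recipe.
de_morgan_or() {
    ! { [ "$1" != 1 ] && [ "$2" != 1 ]; }
}

# Walk the whole truth table and print both results side by side.
for a in 0 1; do
    for b in 0 1; do
        if plain_or "$a" "$b"; then p=true; else p=false; fi
        if de_morgan_or "$a" "$b"; then d=true; else d=false; fi
        echo "a=$a b=$b plain=$p demorgan=$d"
    done
done
```

The two columns agree on every row, which is why the empty { } action followed by an E recipe behaves like an OR.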

8.15 ORing with scoring [toc]

Using scoring is easy: once any of the conditions matches, the score gets a positive value and the recipe succeeds. Idea by Erik Selke selke@tcimet.net

[era comments] ...allegedly the scoring system is going to cost you more than plain old regex matching. Floating-point math and all that, even if you use extremely simple scoring. Thus, it would probably be slightly more efficient to do it De Morgan way.

      :0:
      * 1^0 condition1
      * 1^0 condition2
      some.mbox

8.16 ORing traditionally [toc]

      #  This is just the simple OR case. There are some cases where it's
      #  impossible to OR conditions with this style.
      #
      :0:
      *  condition1|condition2
      some.mbox

9.0 Variables [toc]

9.1 Setting and unsetting variables [toc]

You have already set variables with the "=" syntax. Variable names are case sensitive: var is different from VAR.

      VAR = /var/tmp  # directory
      VAR = "this"    # literal
      VAR = 1
      VAR = $FOO      # another.
      VAR = "$VAR at" # combined with previous value

Unsetting a variable is done like this

      VAR             # now it doesn't exist any more
      VAR=            # same but with old style
      VAR = ""        # Variable is said to be "null" now

And you can put multiple assignments on the same line

      VAR=1  VAR=2  VAR=3

Examine the following forms, which are equivalent. My opinion is that the first one is more readable, because it resembles shell script syntax. The backticks will not require a shell in the absence of any SHELLMETAS, so neither of these will spawn a shell.

          #   We Don't care if file exists this time...
          VAR = `cat file`

But, naturally, it is a matter of style which one you want to use. Below, the first one needs a few more characters (backquotes and the surrounding braces), while the latter doesn't.

          #   Considered "modern" and on par with other sh programming
          #   languages
          :0
          * condition
          { VAR = `cat file` }
          #   oldish and procmail specific, and there have been reports
          #   of errors if you use this construct.
          :0
          * condition
          VAR =| cat file

9.2 Variable initialisation and sh syntax [toc]

Procmail borrows some sh syntax for variable initialisation. Note that sh's ${var:=default} and ${var=defaultvalue} syntaxes are not available in a procmail rcfile.

And here are the classic usage examples

      VAR = ${VAR:-"yes"}     # set VAR to default value "yes"
      VAR = ${VAR+"yes"}      # If VAR is set (even if null), use "yes"

Ever wondered if this calls `date` in all cases?

      VAR = ${VAR:-`date`}

No, procmail is smart enough to skip calling date if VAR already has a value. It doesn't evaluate the whole line. Below you see what each initialising operator does. Study it carefully.

      VAR = ""                # Define variable
      VAR = ${VAR:-"value1"}  # VAR = "value1"
      VAR = ""
      VAR = ${VAR-"value2"}   # VAR = ""
      VAR = ""
      VAR = ${VAR:+"value3"}  # VAR = ""
      VAR = ""
      VAR = ${VAR+"value4"}   # VAR = "value4"
      # Note this:
      VAR = "val"
      VAR = ${VAR:+"value3"}  # VAR = "value3"
      VAR = "val"
      VAR = ${VAR+"value4"}   # VAR = "value4"

      VAR                     # kill the variable
      VAR = ${VAR:-"value1"}  # VAR = "value1"
      VAR
      VAR = ${VAR-"value2"}   # VAR = "value2"
      VAR
      VAR = ${VAR:+"value3"}  # nothing is assigned
      VAR
      VAR = ${VAR+"value4"}   # nothing is assigned
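
Since these operators are borrowed from sh, the table can be cross-checked at a shell prompt. The sketch below is plain sh, not procmail, so it only shows the sh side of the comparison; the helper name expansions and its labels are made up for this example.

```shell
#!/bin/sh
# Print the four expansions for the current state of VAR, one per line.
expansions() {
    echo "colon-minus [${VAR:-value1}]"
    echo "minus       [${VAR-value2}]"
    echo "colon-plus  [${VAR:+value3}]"
    echo "plus        [${VAR+value4}]"
}

VAR=""                       # set, but null
echo 'VAR is null:';  expansions

VAR="val"                    # set, with a value
echo 'VAR is "val":'; expansions

unset VAR                    # does not exist at all
echo 'VAR is unset:'; expansions
```

The operators with a colon treat a null variable like an unset one; the operators without a colon only care whether the variable exists.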

And if you want to choose from several initial values, use the recipe below instead of the standard var = ${var:-"value"}.

      :0
      * VAR ?? ^^^^
      {
          #   no value (or was empty), set default value here based on
          #   some guesses
          VAR = "base-default"
          :0
          * condition
          { VAR = "another-default" }
          ...more conditions..
      }

You could also use equivalent, but less readable condition line in previous recipe:

      *$ ${VAR:+!}

It works, because if variable has value the line expands to

       * !

where "!" is procmail's "false" operation. One more way to do the same is to require that at least one character be present. You could also use the regexp (.), which requires at least one character, but you might not like it matching pure spaces.

      * ! VAR ?? [a-z]

9.3 Testing variables [toc]

If possible, write tests in the positive form, not in the negative form shown below. If test is in effect, TEST_FLAG is set to "no" to prevent executing a recipe.

      :0
      * ! TEST_FLAG ?? yes
      * condition

And below is the same, but now it uses the positive form. In my opinion, this is more readable. You're free to disagree with me at this point, but all in all, it's nicer to look at code that has as few ! flags as possible, especially in variable tests.

      :0
      *  TEST_FLAG ?? no
      *  condition

[phil] The following fails if the variable is unset or null.

      * variable ?? .

That is why it would be better to use

      *$ variable ?? [^$WSPC]

Or

      * variable ?? (.|$)

to require that the variable contain at least one character. But neither is a way to check whether a variable is set or not, because each treats a null variable the same as an unset one. This is the best way I know to check whether a variable is set or not:

      *$ ! ${VAR+!}

[gsutter@pobox.com] Here is yet another way to test if variable is set and if it isn't, sets it to a default value.

      :0
      *$ ! VAR^0
      { VAR = "value" }

9.4 What is construct $\VAR in procmail 3.11 [toc]

[era and david] In procmail 3.11, $\VAR will escape regex metacharacters. It should produce a suitably backslash-escaped expression for procmail's own use. In addition, $\VAR will always begin with leading empty parentheses.

You can't pass the $\VAR construct to shell programs, because of those leading parentheses. Here is a recipe to standardize the regexp. You can pass SAFE_REGEXP to external programs like sed.

      PROCMAIL_REGEXP = "$\VAR"
      :0
      * PROCMAIL_REGEXP ?? ^^\(\)\/.*
      { SAFE_REGEXP = "$MATCH" }

[era] Note that this is slightly inexact; Procmail will backslash-escape according to Procmail's needs, not sed's. For example, Procmail doesn't think braces are magic (although that would be nice to have in Procmail as well) whereas many modern variants of sed do.

9.5 Common pitfalls when using variables [toc]

Procmail is picky and forgives nothing. Here are some of the favourite mistakes one can make:

      $EMAIL  = foo@site.com      # Done Perl lately? Remove that $
      # Erm, this is ok, but many procmail recipe writers want to
      # take extra precautions and enclose the regexps in parentheses.
      # So, maybe (yabba|dabba|doo) would be safer
      #
      REGEXP  = "yabba|dabba|doo"
      :0
      *  Subject:.*$REGEXP  # Hey, you missed the '*$'
      :0
      *$  $REGEXP ?? hello  # surely you meant '* REGEXP ?? hello'

9.6 Quoting: Using single or double quotes [toc]

Pay attention to this:

      VAR = "you"
      NEW = 'hey "$VAR"'  # won't expand $VAR; you get the literal text
      NEW = "hey '$VAR'"  # expands to: hey 'you'

You can even combine separate words together

      VAR = "1 ""and"" 2" # same as "1 and 2"

Don't let these many quotes disturb you, just count the beginning and ending quotes. Superfluous here, but you may need some similar construct somewhere else.

      VAR = '1 '"'"'and'"'"' 2'  # same as: 1 'and' 2
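
The same quoting rules exist in sh, so the assignments above can be tried at a shell prompt before they go into an rcfile. A minimal sketch in plain sh; the names NEW1 to NEW4 exist only in this illustration.

```shell
#!/bin/sh
VAR="you"

NEW1='hey "$VAR"'            # single quotes: $VAR stays literal
NEW2="hey '$VAR'"            # double quotes: $VAR is expanded
NEW3="1 ""and"" 2"           # adjacent quoted pieces are simply joined
NEW4='1 '"'"'and'"'"' 2'     # "'" supplies a literal single quote

printf '%s\n' "$NEW1" "$NEW2" "$NEW3" "$NEW4"
```

Counting the quotes pairwise from the left, as suggested above, is the easiest way to read the last assignment.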

9.7 Quoting: Passing values to a external program [toc]

Remember to include the double quotes when you send variables' values to shell programs. Below you see a mistake: the whole content of SUBJECT is not available in ARGV[1]

      SUBJECT = `$FORMAIL -xSubject:`
      :0
      * condition
      | perl-script $SUBJECT       # mistake; use "$SUBJECT"

There is also another way. If your script can access environment variables (almost all programs can), then you do not need to pass the variables on the command line. Above, SUBJECT is already in the environment, and in perl you can get it like this:

      $SUBJECT = $ENV{'SUBJECT'};

Next, do you know what is the difference between these two recipes?

      :0
      | "command arg1 arg2 arg3"
      :0
      | command arg1 arg2 arg3

You guessed it. The first one quotes the entire command and does not do the right thing; the latter is correct, although, depending on the content of the argN variables, they may need quotes.
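
The effect of the missing double quotes is easy to see in sh, where an unquoted variable is split into words in much the same way. In this sketch count_args is a hypothetical stand-in for the external script, and the subject text is invented.

```shell
#!/bin/sh
# Stand-in for an external script: report how many arguments arrived.
count_args() {
    echo "$#"
}

SUBJECT="Re: procmail tips"

count_args $SUBJECT       # unquoted: the value is split into three words
count_args "$SUBJECT"     # quoted: the script receives one argument
```

With the quotes in place the script sees the complete subject as its first argument, whatever whitespace it contains.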

9.8 Passing values from an external program [toc]

An external program cannot set procmail variables directly; your program must write the values to external files, and procmail then reads the values from these files. Capturing only one value is easy:

      var = `command`      # capture STDOUT

But if program modifies the body and exports some status information about the body conversion, you use this:

      LOCKFILE    = $HOME/.run.lock  # protect external file writing
      valueFile   = $HOME/tmp/values
      #   modify body, and export status values to external file: one
      #   value in every line
      #
      #       VALUE1
      #       VALUE2
      #       VALUE3
      #
      :0 fb
      | command -export-status $valueFile
      values = `cat $valueFile`
      # Derive values from each line
      #
      :0                              # line 1
      *$ values ?? ^^\/[^$NL]+
      { var1 = $MATCH }
      :0                              # line 2
      *$ values ?? ^^.*$\/[^$NL]+
      { var2 = $MATCH }
      :0                              # line 3
      *$ values ?? ^^.*$.*$\/[^$NL]+
      { var3 = $MATCH }
      LOCKFILE    # Release lock

In this case you could pass the option "-export-status" to tell the program where to output the values. You can add a similar option to your own programs, but in most cases the output files are in fixed locations.

9.9 Incrementing variable by N [toc]

[dan and phil] [Rik Kabel rik@netcom.com] Here is a recipe for incrementing a variable by N. If $VAR is not a number, we get an error. Note that if $VAR + $N is not greater than 0, this will not change the value of VAR if the assignment happens inside braces. You must place the assignment after the closing curly brace.

      :0
      *$ $VAR ^0
      *$ $N   ^0
      { }
      VAR = $=

9.10 Comparing values [toc]

It's too expensive to call the shell's test command to do [-lt|-eq|-gt], because you can do the same within procmail. The do-something below is run if SCORE < MAXIMUM. The recipe simply subtracts SCORE from MAXIMUM and checks whether the result is positive.

      :0
      *$ -$SCORE   ^0
      *$  $MAXIMUM ^0
      {
          do-something
      }

[era] it's getting slightly cumbersome if it's between MIN and MAX:

      :0
      *$   $SCORE ^0
      *$  -$MIN   ^0
      {
          :0
          *$ -$SCORE  ^0
          *$  $MAX    ^0
          { }
          :0
          suitable
      }
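
To make the arithmetic behind these two recipes explicit, here it is written out in sh. This is only an illustration of the logic (inside an rcfile the scoring form remains the cheaper choice); the function names below_max and in_range are invented for the example.

```shell
#!/bin/sh
# below_max SCORE MAXIMUM
# Mirrors "* -$SCORE ^0" plus "* $MAXIMUM ^0": the weights are summed and
# the test succeeds only if the total is greater than zero.
below_max() {
    total=$(( $2 - ($1) ))
    [ "$total" -gt 0 ]
}

# in_range SCORE MIN MAX
# Mirrors the nested recipe: SCORE - MIN > 0 and MAX - SCORE > 0.
in_range() {
    below_max "$2" "$1" && below_max "$1" "$3"
}

for score in 2 5 8; do
    if in_range "$score" 3 7; then
        echo "$score is between 3 and 7"
    else
        echo "$score is outside 3 and 7"
    fi
done
```

Because a total of exactly zero does not count as success, both ends of the range are exclusive.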

9.11 Strings: How to strip trailing newline [toc]

Suppose you have previously used the regexp "\/[0-9]+$", which left a newline ($) in MATCH, and saved the result in NBR. If you wonder why the recipe works, remind yourself that the regexp operator "." never matches a newline.

      :0
      * NBR ?? ^^\/.+
      { NBR = $MATCH }

9.12 Strings: Getting partial matches from string [toc]

[dan] Getting a match to the right is quite easy with procmail's match operator \/
      VAR = "1234567890"
      :0
      * VAR ?? ()\/3.*
      { rest = $MATCH }       # rest = 34567890

but deleting 2 characters from the end is nearly impossible without forking an outside process. The cheapest might be expr because it doesn't need a shell to pipe echo to it (as sed would and I believe perl would):

      #   by resetting the shellmetas, this will only call
      #   expr. If we hadn't fiddled with shellmetas,
      #   this would have called two processes: sh + expr
      #
      saved       =   $SHELLMETAS
      SHELLMETAS
      VAR         =   `expr "$VAR" : '\(.*\)..'`
      SHELLMETAS  =   $saved

ksh or bash could do it as well:

      #   semicolon to force invoking a shell, actually
      #   first question mark will force a shell already.
      #
      saved       = $SHELL
      SHELL       = /bin/sh
      VAR         = `echo ${VAR%??} ;`
      SHELL       = $saved

Now, if you know that the last two characters will be "90", that's different. Of course, this totally screws up if the third-to-last character is a 9.

      :0
      * VAR   ?? ()\/.*[^0]
      * MATCH ?? ()\/.*[^9]
      { VAR = $MATCH }

[jari] Comments: If a shell must be used, then awk is a good tool for simple string manipulation. Its startup time is faster than perl's, whose overhead is due to internal compilation. awk also consumes fewer resources overall than perl. The following will only work if VAR is a string consisting of one contiguous block of characters. (ARGV[1] can be used)

      VAR = ` awk 'BEGIN{ v = ARGV[1];                                \
              print substr(v,1,length(v)-2); exit }'                  \
              "$VAR"                                                  \
            `

This version requires some file, any file, so that we get awk started. In the previous code all the work was done in the BEGIN block and no file was ever opened.

      VAR = ` awk '{print substr(v,1,length(v)-2); exit }'            \
              v="$VAR" /etc/passwd                                    \
            `

[dan] comments on awk: expr is sure to be a smaller binary than awk for procmail to fork, and it needs much less command-line code to do this job. Note also that one still has to diddle with SHELLMETAS to avoid a shell, because the awk code contains brackets; thus it doesn't replace all.
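
The three ways of dropping the last two characters can be compared outside procmail. A small sketch in sh, assuming a POSIX-style shell for the ${VAR%??} form; the with_* variable names are only for this illustration.

```shell
#!/bin/sh
VAR="1234567890"

# expr: the \( \) group captures everything except the last two characters.
with_expr=$(expr "$VAR" : '\(.*\)..')

# Shell parameter expansion: % removes the shortest suffix matching ??.
with_shell=${VAR%??}

# awk: all the work happens in BEGIN, so no input file is ever opened.
with_awk=$(awk 'BEGIN { v = ARGV[1]; print substr(v, 1, length(v) - 2); exit }' "$VAR")

echo "expr:  $with_expr"
echo "shell: $with_shell"
echo "awk:   $with_awk"
```

All three produce the same string; which one is cheapest inside an rcfile depends on whether procmail has to start a shell first, as discussed above.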

9.13 How to raise a flag if the message was filed [toc]

      FILED = !
      :0 c:           # We process the message more
      * condition
      foo
      :0 a
      { FILED }
      ...
      :0              # Stop if previous cases filed the message
      *$ $FILED
      { HOST = "_done_" }

Or alternatively: procmail automatically sets LASTFOLDER if it delivers the message to a mailbox.

      LASTFOLDER
      :0 c:
      * condition
      foo
      :0 c:
      * condition
      bar
      ... et cetera ...
      :0
      *$ ${LASTFOLDER+!}!
      { HOST = "_done_" }

9.14 Dollar sign in condition line [toc]

      This doesn't seem to work for me...
      * ^TO$\foo@bar.com
[david] An unescaped dollar sign later in the line represents a newline, so what you have there is searching for the following:
  1. An expression that matches the expansion of the ^TO token (which is anchored to the start of a line by its definition), followed by
  2. A newline, followed at the start of the next line by
  3. "foo@bar" [the backslash escapes the f, which didn't need escaping], followed by
  4. any character that is not a newline (the period is unescaped), and finally
  5. "com".

Try this instead:

      :0:
      *$ ^TO$\foo@bar\.com
      foo

In fact, to avoid matches to things like foo@bar.community.edu, you might want to do it this way:

      :0:
      *$ ^TO$\foo@bar\.com\>
      foo

9.15 Finding mysterious foo variable [toc]

I have my fellow worker's procmail code, and he uses a variable FOO that I can't find anywhere in his files. It's not a shell variable either, because it's literal. Where does it come from?

Your procmail runs /etc/procmailrc when it starts, please check that. It may define some common variables already for all users.

9.16 How to OR recipes with procmail [toc]

[dan] A simple ORing can be accomplished with
      * (pattern1|pattern2)

Likewise, two exit code tests can often be ORed like this

      * ? command1 || command2

But there are many situations where two tests cannot be ORed by combining them into one condition:

How can I make OR conditions that all use the SAME action? I want to be able to test for a number of variants on certain requests, all in one block

[hal] Yes, this can be easily done

      CASE = 0
      :0
      * case 1 tests
      {
          CASE = 1
      }
      :0E
      * case 2 tests
      {
          CASE = 2
      }
      :0
      * ! CASE ?? 0
      {
          # real work, perhaps with explicit tests on CASE
      }

N.B. the above test is a regexp, so if you have more than 9 cases, you'll have to ensure that whatever test you use doesn't mess up.

9.17 Storing code to variable [toc]

One way to run complex code in procmail recipe is first to store it to variable. Idea by [era]. You could do this in separate shell script too. The following example reads URLs from the body of message: the URLs have been put to separate lines and some special Subject is used to trigger the dumping of the html pages
      #   Code by [era]
      #
      COMMAND='while read url; do
          case "$url" in
            *://*)
              lynx -traversal -realm -crawl -number_links "$url" |
              $SENDMAIL -oi $LOGNAME
              ;;
          esac
      done'

      #  Notice the trailing semicolon after eval !
      :0 bw
      * ^Subject: xxxxx
      | eval "$COMMAND" ;
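
The mechanism can be tried without procmail or lynx. In the sketch below the loop stored in COMMAND only echoes what it would fetch, and the message body is invented for the illustration; it is not era's original code.

```shell
#!/bin/sh
# COMMAND holds a complete shell loop as a string. The single quotes keep
# "$url" from being expanded at assignment time.
COMMAND='while read url; do
    case "$url" in
      *://*) echo "would fetch: $url" ;;
    esac
done'

# Hypothetical message body: only lines that look like URLs are picked up.
body='see these pages
http://example.com/a
not a url
ftp://example.com/b'

# eval runs the stored loop, which reads the body from standard input.
fetched=$(printf '%s\n' "$body" | eval "$COMMAND")
echo "$fetched"
```

The double quotes around $COMMAND matter: they hand the stored text to eval as one piece, newlines included.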

If you want to run the code inside a nested block, then look carefully: there are double quotes around the command in backticks. If you leave the double quotes out, each word in SH_CMD would be interpreted separately.

      SH_CMD = '$echo "$VAR" >> $HOME/test.tmp'
      :0
      * condition
      {
          # condition satisfied; run the given shell command
          # and do something more.
          #
          dummy = `"$SH_CMD"`
          ..rest of the code..
      }

Similar construct works for message echos too.

      MESSAGE='Thank you so much for your message.
      Unfortunately, the volume of mail I receive .... (blah blah blah).
      If your matter is urgent, try calling +358-50-524-0965.
      '
      :0 hw
      * ! ^X-Loop: moo$
      | ( $FORMAIL -rt -A"X-Loop: moo" ; echo "$MESSAGE" ) |\
       $SENDMAIL -oi -t

9.18 Getting headers to a variable [toc]

[david] Here are several ways to get the entire header into a variable:
      HEADER = `$FORMAIL -X ""` # The space after the X is vital.
      HEADER = `sed /^$/q` # also writable as   HEADER=`sed /./!q`
      :0 h
      HEADER=|cat -

will save the entire header into one variable. It has to be smaller than $BUFSIZE, though. This way might work as well, and will require no outside processes if it does:

      :0              # H flag is implicit
      * ^^\/(.+$)*$
      { HEADER = $MATCH }
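
The sed variant is easy to check in isolation. A minimal sketch, using an invented two-header message in place of real mail:

```shell
#!/bin/sh
# Hypothetical message: header, one empty line, then the body.
message='From: alice@example.com
Subject: test

Body line 1
Subject: not a header'

# sed prints every line up to and including the first empty one and then
# quits; the command substitution drops that trailing empty line.
HEADER=$(printf '%s\n' "$message" | sed '/^$/q')
echo "$HEADER"
```

Nothing after the first empty line reaches HEADER, so a line in the body that merely looks like a header is not picked up.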

9.19 Converting value to lowercase [toc]

If you know that a word belongs to set of choices, you can do this inside procmail
      LIST = ":word1:word2:word3:word4"   # Colon to separate words
      WORD = "WORD1"
      :0
      *$ LIST ?? :\/$WORD
      {  WORD = $MATCH }

But if you don't know the word or string beforehand, then this is the generalized way. [idea by era and david]

      :0 D
      * WORD ?? [A-Z]
      { WORD = `echo "$WORD" | tr A-Z a-z` }
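
The tr call used in the generalized recipe can be tested on its own. In this sketch to_lower is a made-up helper name, and the A-Z range assumes plain ASCII input.

```shell
#!/bin/sh
# Lowercase the first argument with tr; digits and punctuation pass through.
to_lower() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z'
}

WORD="WORD1"
WORD=$(to_lower "$WORD")
echo "$WORD"
```

This costs an external process (plus a shell for the pipe), which is why the lookup-in-a-list trick is preferable when the set of words is known in advance.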

10.0 Suggestions and miscellaneous [toc]

10.1 Speeding up procmail [toc]

10.2 See the procmail installation's examples [toc]

Did you remember to look at the examples that come with procmail? If not, it's time to give them a chance to educate you. Here is one possible directory to look in. Ask your sysadmin if you can't find the right directory.

      % ls /usr/local/lib/procmail-3.11pre7/examples/

Or if you're really anxious to get on your own, try this. The directory /opt/local is for HP-UX 10 machines, and the file forward contains an example of how to define your .forward for procmail.

      % find /opt/local/ -name "forward" -print

If the find succeeded and found the file, then you know where the procmail files installation directory is.

10.3 Printing statistics of your incoming mail [toc]

If you keep procmail logging turned on, the log records the folder in which each message was filed. There is a program, mailstat, which can process the procmail.log file and print a nice summary of it. If you generate the summary at midnight and clear the log, you get a pretty nice per-day, per-folder traffic analysis.

      # -m merges all error messages into a single line
      % mailstat -km procmail.log

10.4 Storing UBE mboxes outside of quota [toc]

I want to store spam outside my disk quota. Problem: if I tell procmail to deliver to, say, /tmp/spam.box, it does so just fine (according to the log). Unfortunately, it delivers to /tmp on the mail host, which I cannot access. spam.box doesn't appear in the /tmp directory of the shell machine when procmail is invoked for incoming mail.

[phil] Under the most likely configuration of sendmail in this situation, it is impossible to have procmail invoked by sendmail on the shell machine: sendmail is probably set to just forward all mail to the designated mail delivery machine.

There are other options: you could temporarily store the mail in your account, then have a cronjob on the shell machine that reprocesses the message. That would probably be more efficient than having each message trigger an rsh to the shell machine. If you actually get enough spam that it's pushing against your quota, then the rsh is too expensive -- use a cronjob that invokes something like:

      cd your-maildir     &&
      lockfile spam.lock  &&
      test -s spam        &&
      {
          cat spam >> /tmp/spam.box && rm -f spam spam.lock || \
          rm -f spam.lock;
      }

WARNING: the above assumes the following:

If the latter two of those conditions isn't true OR IF THEY MIGHT CHANGE then you should use formail -s to break the message apart and invoke procmail on each one separately.

[era] Many sites cross-mount directories for various reasons. /tmp is always local but /var/tmp might be cross-mounted between the login host and the mail host; another one to try is /scratch -- and if all else fails, ask your admin to set up an NFS share for this purpose.

10.5 Gzipping messages [toc]

[Sean B. Straw PSE-L@mail.professional.org] On the recipe delivery line where you'd normally be tossing it into a folder do this instead:
      |gzip -9fc >> $MAILDIR/mail.mbox.gz
This will compress each message as it comes in (and since most are TEXT, it does a fine job - MIME, OTOH is one of the best ways to mailbomb someone since it doesn't compress well - but the indirect bombing via mailing lists doesn't do this), reducing the disk space required, usually dramatically. Done in conjunction with something like the following at the end of your .procmailrc, you could have a header file you could quickly rummage through looking for valid messages to add to a procmail recipe, then run:
      gzip -d -c mail.mbox.gz | formail -s procmail -m recipe.rc

(note that if the recipe delivers into the checkme.gz file on any condition, then you should look to MOVE the file before running this process, and use the moved version. In fact, this would be a good idea anyway, as newly delivered mail may appear in the end of the gzip file while you're doing this - and since your ultimate goal is to be able to eliminate junk, you'll want to know that after you've processed a gzipped mail file, you can delete it without accidentally whacking new mail).

      # If not handled already. You might add to this a condition
      # checking to see if the message has certain list
      # characteristics.
      :0
      * LASTFOLDER ?? ^^^^
      {
          # Save the message in case we need to retrieve it.
          :0 c:
          |gzip -9fc >> $MAILDIR/mail.mbox.gz
          # copy headers for easy browsing - including being able to
          # identify lists you're being subscribed to.
          :0 h
          mail.mbox.log
      }
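
Appending to a .gz file works because several gzip streams written one after another decompress as a single stream. The sketch below shows this with two invented messages and a temporary file standing in for $MAILDIR/mail.mbox.gz; it assumes gzip and mktemp are installed.

```shell
#!/bin/sh
# Temporary stand-in for the compressed mailbox.
box=$(mktemp) || exit 1

# Each message is compressed separately and appended, as the recipe does.
printf 'From a@example.com\nfirst message\n\n'  | gzip -9fc >> "$box"
printf 'From b@example.com\nsecond message\n\n' | gzip -9fc >> "$box"

# Decompressing the whole file yields both messages, in order.
unpacked=$(gzip -dc < "$box")
rm -f "$box"

echo "$unpacked"
```

The decompressed output is an ordinary mbox again, which is what lets formail -s split it later.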

10.6 Using first 5-30 lines from the message [toc]

[era] The regex to grab few lines (or all of them, if there are less than fifty) is not going to be very pretty, but it saves launching an extra process.
      :0 B
      * $ ^^$WSPCL*\/$NSPC.*$(.*$)?(.*$)? ... etc, the rest of the lines
      { toplines =  $MATCH }

The skipping of whitespace at the beginning of the message is of course not necessary. You should probably set LINEBUF reasonably high if you grab many lines, say 30: 80*30 = 2400 bytes; setting it to 8192 or 16384 is probably a good idea, depending on how much you want to match. The above gets ugly quickly, so you may prefer to launch a process after all:

        #  Set N first, e.g. N=30; use sed ${N}q if you don't have head
        :0 Bi
        { toplines = `head -$N` }
        :0a
        * toplines ?? pattern
        {  do-it }
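The sed fallback mentioned in the recipe comment can be tried outside procmail first. A minimal sketch, assuming a POSIX sed; the function name first_lines is hypothetical:

```shell
# first_lines N: copy at most N lines from stdin to stdout, then quit.
# Behaves like "head -N" for this purpose.
first_lines() {
    sed "${1}q"
}

# Possible use inside a recipe, instead of head:
#   { toplines = `sed ${N}q` }
```

If the input has fewer than N lines, everything is passed through unchanged.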

10.7 Using cat or echo in scripts [toc]

I have seen a lot of examples that use 'echo', i.e.,
      :0
      * condition
      | echo "first line of message" ; \
        echo "second ..." et cetera

I started out with spam.rc from "ariel" which got me into the habit of

      :0
      * condition
      | cat file_containing_message

although I note that spam.rc did have one recipe using the echo method. What are the reasons for choosing each method over the other?

Here is a comparison table. Choose the one you think is best for you

10.8 How to run an extra shell command as a side effect? [toc]

I was once wondering what would be the wisest way to send messages to my daily "biff" log file about the events that happened during my .procmailrc execution. This is how [david] commented on my ideas
      # case 1: print to BiffLog
      dummy = "`echo message: $FROM $SUBJECT >> $biff `"

[david] Problems: you get no locking on the destination file, and unless you put it inside braces you have to run it on every message unconditionally. (Also, procmail tries to feed the whole message to a command that won't read it, but the remedies for that don't help very much.)

      # case 2: This would be a delivering recipe, so we have to use
      #        the c flag to keep the message going.
      :0 whic:
      | echo message: $FROM $SUBJECT >> $biff

This locks the destination file and you can add conditions to it, so it's probably the best. If the head or the body is less than one bufferful, you can limit the unnecessarily written data with h or b, but I think that in most OSes a partial buffer and a full one are the same amount of effort.

      # case 3: We use side effect of "?" here. Cool.
      :0
      *  condition
      *  ? echo message: $FROM $SUBJECT >> $biff
      { }         # procmail no-op

We have conditions possible, but there is no locking on the destination file. I'd go with method #2 or a variation thereof:

      #   we don't necessarily need w
      :0 hic:
      * condition
      | echo message: $FROM $SUBJECT >> $biff
      #   Or you could use this
      :0 hi:
      * condition
      dummy=| echo message: $FROM $SUBJECT >> $biff

Now that [david] has explained how the various ways differ from each other, I present the recipe where I used case 3. When I was dropping a message to a folder, I wanted to send a message to the biff log too. The idea is that the drop-conditions have already matched, and then we run an extra command by using the side effect of the "?" token. As far as the recipe is concerned, the "?" is a no-op. The pedantic way would have been to add a LOCKFILE around the recipe, but imagine 50 similar recipes like this... and you understand why the LOCKFILE was left out. It's only necessary if you worry about sequential writing to the biff file.

      :0
      * drop-condition
      * ? echo message: $FROM $SUBJECT >> $biff
      $MBOX

10.9 Forcing "ok" return status from shell script [toc]

...the "?" trick only allows running some additional shell commands (echo is one that always succeeds) while the conditions above have already determined that the drop will take place. And if "misbehaving-shell-script" always returns a failure exit code, you can still make the condition succeed with:
      * ? misbehaving-shell-script || echo

[david] If the script always returns a failure code, just do this:

      * ! ? misbehaving-shell-script

The more complex case is a script that can return either success or failure but you don't care which; if the drop conditions passed, you want to run the action line.

echo can also fail if the process lacks permission or opportunity to write to stdout. A more reliable choice is true(1); its purpose in life is to do nothing but exit with status 0.

The command : is a shell builtin which always returns true status. It is not exactly more readable than true(1), but "|| :" will save the invocation of true (unless true is built into $SHELL); procmail will still run a shell, though. On the other hand, as long as the command itself has no characters from $SHELLMETAS, a weight of 1^1 and no "|| anything" will avoid the shell process as well.

However, there is yet a better way to make sure that a failure by the script doesn't make procmail abort the recipe:

      :0 flags
      * other conditions
      * 1^1 ? shell-script
      action

Regardless of the exit status of the script, the condition will score 1 and not interfere with procmail's decision about the action line of the recipe. Weighted exit code conditions behave like this (see the procmailsc(5) man page):

      * w^x ? command

scores w on success or x on failure.

      * w^x ! ? command

scores the same as this:

      * w^x  pattern_that_appears_in_the_search_area_$?_times
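The "|| :" idiom discussed above can be checked in a plain shell before it goes into a recipe. A minimal sketch; the wrapper name ignore_status is hypothetical:

```shell
# ignore_status CMD [ARGS...]: run CMD, then report success no matter
# what exit code CMD returned. The output of CMD is left untouched.
ignore_status() {
    "$@" || :
}

# Equivalent condition line in a recipe:
#   * ? misbehaving-shell-script || :
```

Unlike the 1^1 weighted form, this variant still needs a shell, because "||" is one of the $SHELLMETAS characters.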

10.10 Make your own .procmailrc available to others [toc]

There is never too much to learn about procmail, and the best source is the rc files that people have written. Remember to comment your procmailrc file well before you make it available. Below is a recipe for sending the .procmailrc upon request. If you want to send anything more than one or two files (many times you want to make other files available too), then please do not use this code but the general file server modules found in the 'Procmail module list'.
      :0
      * !^Subject:.Re:
      *  ^Subject:.*send.*procmailrc
      * !^FROM_DAEMON
      {
          :0 fhw:
          | $FORMAIL -rt -A "Precedence: junk" \
                    -I     "Subject: Requested .procmailrc";
          :0 ahic
          | ( cat - $HOME/.procmailrc ) | $SENDMAIL -oi -t
          :0              # trash the "Send procmailrc" request
          /dev/null
      }

10.11 Using dates efficiently [toc]

Note: See module list, where you will find "date" and "time" parsing modules. You can also parse the date from the first Received or From header if it is the same each time in your system. That would be orders of magnitude faster and decreases your system load if you receive lot of mail.

Calling date in your procmail script many times is not a good idea. Use the MATCH as much as possible to be efficient in procmail, like below where we call date only once. If you are not in the same time zone as your server, and you want an accurate report of the date, you might amend the invocation to the following:

      date = `TZ="KDT9:30KST10:00,64/5:00,303/20:00";date "+%Y %m %d"`

The basic recipe is here

      # By Rik Kabel rik@netcom.com
      # add %H:%M:%S if you want these as well
      #
      date = `date "+%Y %m %d"`
      :0
      * date ?? ^^()\/....
      { YYYY = $MATCH }
      :0
      * date ?? ^^..\/..
      { YY=$MATCH }
      :0
      * date ?? ^^.....\/..
      { MM=$MATCH }
      :0
      * date ?? ()\/..^^
      { DD=$MATCH }
      TODAY   = "$YYYY-$MM-$DD" # ISO std date: like 1997-12-01

A cron job could also generate the date.rc file:

      #!/bin/sh
      YYYY=`date +%Y`
      YY=`date +%y`
      MM=`date +%m`
      DD=`date +%d`
      PMSRC=$HOME/.procmail
      OUT=$PMSRC/date.rc
      #  The file will contain assignments such as
      #
      #   TODAY=1997-12-01
      #
      echo "YYYY=$YYYY"           >$OUT
      echo "YY=$YY"               >>$OUT
      echo "MM=$MM"               >>$OUT
      echo "DD=$DD"               >>$OUT
      echo "TODAY=$YYYY-$MM-$DD"  >>$OUT
      # end of script

And your .procmailrc includes the statement

      INCLUDERC = $PMSRC/date.rc
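In the spirit of calling date only once, the cron script could also take all fields from a single invocation, which additionally keeps the values consistent if the job happens to run right at midnight. This is only a sketch; the function name write_date_rc is hypothetical and the output path is whatever you pass in:

```shell
# write_date_rc FILE: write YYYY, YY, MM, DD and TODAY assignments to
# FILE, all taken from one call to date(1).
write_date_rc() {
    out=$1
    # Deliberately unquoted: the shell splits the four fields into $1..$4.
    set -- $(date "+%Y %y %m %d")
    {
        echo "YYYY=$1"
        echo "YY=$2"
        echo "MM=$3"
        echo "DD=$4"
        echo "TODAY=$1-$3-$4"
    } > "$out"
}

# Possible use from cron:
#   write_date_rc "$HOME/.procmail/date.rc"
```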

10.12 Keep message backup, no matter what [toc]

It's good to have some safety measures in your .procmailrc. Even if you are an expert and have checked your recipes ten times, there is still a chance that something breaks. One morning, when you browse your HIN header log, you notice "Hm, there is that interesting message but it was not filed, where is it?". Then you go to study the procmail logs (you do keep the log going all the time) and it hits you: "Gosh, a mistake in my script! The message was fed to a malicious pipe and I had that i flag there... sniff". And you wish you had backed up the message in the first place.

So, before your procmail does anything to your message, put the message into some folder which is regularly expired. For example, I use Emacs Gnus to handle the expiring for me; you may want to use a cron job instead. It's time to relax; your email is now safe after this. (A *big* phew from me when I did this.) Alternatively: be A Real Man and don't use backups ;-)

      SPOOL       = $HOME/Mail/spool
      #   Backup storage
      #   - This could be directory too, in that case you could use
      #     cron job to expire old messages in regular intervals
      #   - For once a day expiration, see procmail module list
      #     and pm-jacron.rc
      BUP_SPOOL   = $SPOOL/junk.bup.spool
      :0 c:
      $BUP_SPOOL

Naturally you can filter out the mailing list messages from the backup, because losing one or two of them may not be that serious. Maybe you could use two backup spools, one for your mailing lists and the other for your private messages.

      :0 c:
      * ! mailing-list1|mailing-list2
      $BUP_SPOOL
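If the backup spool is a directory with one file per message, as the comment in the first recipe suggests, the cron-driven expiry could look roughly like this. A sketch only: the function name expire_backups, the directory and the age limit are assumptions, not something the recipes above define.

```shell
# expire_backups DIR DAYS: delete regular files under DIR whose
# modification time is more than DAYS days in the past.
expire_backups() {
    dir=$1 days=$2
    [ -d "$dir" ] || return 1
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Possible crontab entry body, run once a day:
#   expire_backups "$HOME/Mail/spool/junk.bup.spool" 7
```

This does not work for a single mbox file, whose modification time changes with every delivery; there you would have to rotate the file instead.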

10.13 Keep simple header log [toc]

Here is a simple strategy: record everything that comes in and record what happened to each message. See how the info is constantly recorded to the HIN (header-in) folder. You can now check the HIN log every day to see if the messages were sunk to the right folders. Remember to add the HIN rule to every recipe, so that the sink message [sunk-somewhere] is recorded after the incoming message headers.

I use this one-liner log in my Emacs window, which is updated by the live-find-file process all the time (see the Emacs tools section later). It gives a nice overview of the email messages that I'm receiving: it's my biff(1) equivalent in Emacs.

      #  Give Name pm*  for all procmail files
      #
      NULL    = $SPOOL/junk.null.spool    # /dev/null is dangerous
      HIN     = $PMSRC/pm-in.hdr          # "(H)eader (in)box"
      # formail and other settings
      # - (F) prefix used for formail variables
      # - New formail knows the "z" option, which zips out
      #   the whitespaces
      #
      FFROM   = `$FORMAIL -zxFrom:`        # var FROM might be a regexp macro
      FSUBJ   = `$FORMAIL -zxSubject:`
      FID     = `$FORMAIL -zxMessage-Id:`
      TODAY   = "$YY-$MM-$DD $HH:$MIN"    # ISO format; HH and MIN (minutes) must be set elsewhere, MM is the month
      # ............................................. incoming ...
      #  record log of incoming mail
      #
      :0 hwic:
      |  echo "$TODAY $FFROM $FSUBJ" >> $HIN
      # ......................................... null recipe ...
      # spam-like addresses - let friends@planetall.com fall through
      #
      :0
      * From:.*(remove|delete|free|friend@)
      {
          #   Record what happened to this mail. The c flag keeps the
          #   message going so that the sink recipe below still runs.
          :0 hwic:
          |  echo "  [null-AddrReject]" >> $HIN
          :0 :
          $NULL               # And sink it to folder
      }

10.14 Emergency stop for your .procmailrc [toc]

I find that whenever I am testing a new recipe and procmail runs into a loop, it starts sending me mail messages at an average of about 5 per second. I then have to quickly run over to my procmailrc and start disabling my individual "control" recipe files. Yet I figure that in situations like this, where every second is important, there must be a better way.

[alan] This is quite easy already; put this at the top of your procmailrc:

      :0
      * ? test -f $HOME/.procmailrc.stop
      {
          EXITCODE= 75        # Means: retry later; requeue
          HOST    = _stopped_by_external_request_
      }

Then, when testing your procmailrc, you can simply do the following to disable your procmailrc filtering.

      % touch $HOME/.procmailrc.stop

11.0 Scoring [toc]

11.1 Using scores by an example [toc]

First make all the needed matches and let the score accumulate. Examine the score after the final value has been calculated. The condition lines say:
      # Idea by 26 Sep 97 Stephane Bortzmeyer bortzmeyer@pasteur.fr
      #
      :0
      *     -250^0
      * ^Subject:\/.+$
      *       50^1    MATCH ?? [!]
      *       50^1    MATCH ?? [$]
      *  100^1  MATCH ?? ()\<(free|sex|opportunity|money|great)\>
      *     -250^0   ^Subject: *(Fwd|Fw|re):
      * B ?? 100^0    !!!
      { }             # official procmail no-op
      SCORE = $=      # Score has been calculated
      :0 fhw
      | $FORMAIL -i "X-Spam-Score: scored $SCORE"

      :0:             # If score had positive value, sink message
      *$ $SCORE^0
      junk.spam.mbox

Given the following subject:

      "Great opportunity for free sex; no money required!!!!"

procmail scores it this way: ! was found 4 times (200/weight 50), "free|sex..." regexp matched 4 times (400/weight 100).

               condition score    Total sum so far
                          ----    ----------------
      procmail: Score:    -250    -250 ""
      procmail: Score:     200     -50 "[!]"
      procmail: Score:       0     -50 "[$]"
      procmail: Score:     400     350 "^Subject:.*\<free|sex|...>"
      procmail: Score:       0     350 "^Subject: *(Fwd|Fw|re):"
      procmail: Score:       0     350 ! ""
      procmail: Assigning "SCORE=350"

[david] Some notes on possible regexps and their differences:

      * 100^1 ^Subject:.*\<(free|sex|opportunity|money|great)\>

That condition says to score 100 for every subject line that contains any of those five words ... not to score 100 for every one of those words in the subject, but 100 for every subject line that contains any of those words. So it will never score more than 100 unless there are multiple subject lines. You see, it offers five alternative regexps:

      ^Subject:.*\<free\>
      ^Subject:.*\<sex\>
      ^Subject:.*\<opportunity\>
      ^Subject:.*\<money\>
      ^Subject:.*\<great\>

Offhand, I think regexp below would score 400: 100 for "Subject.*free" and 100 for "sex" etc. Of course, the score might be higher if other lines in the head included the strings "sex", "opportunity", "money", or "great<word border>", but appearances of "<word border>free" outside the subject wouldn't be counted.

      * 100^1 ^Subject:.*\<free|sex|opportunity|money|great\>
      [translates to]
      ^Subject:.*\<free
      sex
      opportunity
      money
      great\>

And this one would score 400 too. How? $MATCH would contain whole subject and there would be non-overlapping matches to " great ", " opportunity ", and " free ". If we got rid of either or both of the word-border marks, it would score 500.

      Subject: Great opportunity for free sex; no money required!!!!
      * 100^1 MATCH ?? ()\<(free|sex|money|opportunity|great)\>

11.2 Brief Score tutorial [toc]

[elijah] If you're serious about using scores, please spend a minute by reading this short example.
      VERBOSE = "yes"
      :0
      *  1^1 foo
      * -2^2 bar
      { }
      a = $=
      :0
      *  1^1 foo
      * -2^2 bar
      {
        :0 f
        | echo Whee: fun ; cat -
      }
      b = $=
      :0
      *  1^1 foo
      * -2^2 bar
      {
        whee = "fun"
      }
      c = $=
      :0
      /dev/null

Then if you would send a message

      From foo Fooof
      To: bar
      Subject foobar
      body-something-here

The log file will tell you what happened.

      procmail: [20175] Fri Sep 26 10:25:23 1997
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "a=-3"
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "b=0"
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "c=-3"
      procmail: Assigning "LASTFOLDER=/dev/null"
      procmail: Opening "/dev/null"
      From foo Fooof
        Folder: /dev/null 46
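The figures in the log can be reproduced by hand. According to procmailsc(5), a condition weighted w^x adds w for the first match, w*x for the second, w*x*x for the third and so on. The sketch below does that arithmetic in the shell; the function name score is hypothetical, and it only handles integers, whereas procmail itself also accepts fractional weights.

```shell
# score W X N: total contribution of a "W^X" condition that matched
# N times: W + W*X + W*X^2 + ... (N terms). Integer arithmetic only.
score() {
    w=$1 x=$2 n=$3
    total=0 term=$w
    while [ "$n" -gt 0 ]; do
        total=$((total + term))
        term=$((term * x))
        n=$((n - 1))
    done
    echo "$total"
}

# "foo" matched 3 times at 1^1, "bar" matched twice at -2^2:
#   score 1 1 3     gives  3
#   score -2 2 2    gives -6   (-2 for the first match, -4 for the second)
```

With an exponent of 0, as in 100^0, only the first match counts, however many matches follow.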

11.3 Score's scope [toc]

If you have a delivering recipe and the score is positive, the action lines are executed. If the score is <=0, then the $= information is lost. Notice that the score variable is not available outside of this {} block.
      :0 condition-related flags
      * conditions
      {
          LOG = "Score for condition xxxx was: $="    # $= is positive
          :0:
          mbox
      }
      #   Won't work.  $= is getting set back to 0 outside of
      #   the delivering recipe.
      #
      LOG = "Score for condition xxxx was: $="

[david] If you want to save the score of a recipe even if it is zero or negative,

      :0 condition-related flags
      * conditions
      { } # official procmail no-op
      SCORE = $=
      :0A action-related flags
      action_if_positive

If other recipes that clobber the references for the A flag intervene, this will work:

      :0 condition-related flags
      * conditions
      { }                     # official procmail no-op
      SCORE = $=
      ... more stuff ...
      :0 action-related flags
      *$ $SCORE^0
      action_if_positive

11.4 Counting lines in a message (Adding Lines: header) [toc]

Idea by David Karr dkarr@nmo.gtegsc.com, 1995-10-03: adding a "Lines:" header. [david] later corrected it on 1998-01-02: for one thing, the second condition always counts one too many (the final newline plus the closing putative newline create the extra match); second, after making that correction, an empty body would score zero and leave the variable undefined.
      :0
      * 1^1 .
      * 1^1 ^.*$
      * -1^0
      { }
      lines = $=
      :0 fhw
      * ! ^Lines:
      | $FORMAIL -a "Lines: $lines"
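To cross-check a computed Lines: value outside procmail, the body lines of a saved message can be counted with awk. This is a sketch that assumes Lines: is meant to hold the number of body lines (everything after the first blank line); the function name body_lines is hypothetical.

```shell
# body_lines: read one message on stdin and print the number of lines
# that follow the first blank line (the header/body separator).
body_lines() {
    awk '
        in_body { n++ }                   # count every line after the separator
        /^$/ && !in_body { in_body = 1 }  # first blank line ends the header
        END { print n + 0 }
    '
}

# Possible use:
#   body_lines < saved-message
```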

The reason we used it at all was that size conditions worked only on the entire text regardless of H or B or HB flags at the top of the recipe. Nowadays we can do this and get the accurate figure in one condition:

      # leave `B ??' out to measure the entire message
      :0
      * 1^1 B ?? > 1
      { }
      size = $=

If you want to be silly about it (as some of us very often do),

      :0
      * -1^1 B ?? > -1
      { }
      size = $=

gives the same result, and as long as the search area is non-empty, so do these, which are even sillier:

      :0
      * 1^-1 B ?? < 1
      { }
      size = $=
      :0
      * -1^-1 B ?? < -1
      { }
      size = $=

[Karr] This recipe counts bytes in the message. You could use it as a Content-length replacement, but prefer using the next recipe. The first score counts every character, and the second score sums up every line (that is: newlines are added).

      :0 HB        # use B to measure body only
      *    1^1 .
      *    1^1 ^.*$
      {
          textsize = $=
          :0 fhw
          * ! ^Content-length
          | $FORMAIL -a "Content-length: $textsize"
      }

11.5 Determining if body is longer than header [toc]

      :0
      *  1^1 B ?? > 1
      * -1^1 H ?? > 1
      {
          ..body was longer
      }

11.6 Matching last Received header [toc]

Here is a nice way to use scores to hit the bottommost Received header, idea by [david]
      :0
      * $ 1^1 ^Received:.*by$s+\/.*
      action

11.7 How to add Content-Length header [toc]

We use procmail for local delivery, and would like to get it to generate the content-length header, if one doesn't exist. SUN-OS mailtool at least gets confused and merges messages together if there is no message body.

[stephen] All you need to do is: a) Make sure that procmail is started without the -Y flag. b) Either, in your sendmail.cf, insert:

      H?l?Content-Length: 0000000000

Or (slightly less efficient), insert the following recipe in your /etc/procmailrc file and Procmail will take care of any necessary magic.

      :0 hfw
      * !^Content-Length:
      | /usr/bin/formail -a "Content-Length: 0000000000"

11.8 Processing messages shorter than N lines [toc]

Size conditions ignore H and B on the flag line and always work on HB unless another search area is specified on the condition's own line. To test ONLY the body,
      :0       # Note: this is in BYTES
      * $ B ?? < $N
      { whatever }

This syntax would obey a B flag on the flag line:

      :0 B     # Note: this is in LINES
      * -1^1 .
      * -1^1 ^.*$
      * $ $N^0
      { whatever }

11.9 Counting commas with recursive includerc [toc]

[david] and I [phil] ended up with the following recursive INCLUDERC to count the comma separated items in any number of headers. This is more generic than the solutions presented recently as it will work whether or not there are multiple To: or Cc: headers, and will correctly (for some values of correctly) handle Resent-* headers.

Here's the last mail I have from the thread, with one small bugfix applied. In this case the goal was to bounce any message with more than 19 recipients.

      # pm-mycomma.rc -- Count commas in To/Cc fields.
      #
      #   Put the regexp of the lines to examine in REGEXP Also count
      #   Apparently-To: headers, just because they're obnoxious.
      :0
      * ^Resent-(From|Date|To|Cc|Message-Id):
      { REGEXP = "^(Resent-(To|Cc)|Apparently-To):" }
      :0E
      { REGEXP = "^((Apparently-)?To|Cc):" }

      #   Put the string to match against into $HEADERLINES
      :0                                      # H is implicit
      * $ ()\/$REGEXP(.*$)*
      {
        HEADERLINES = $MATCH                  # Set up the initial run.
        #     What's the maximum number of items allowed? This passes,
        #     one more gets torched.
        #
        EXCESS    = -19
        INCLUDERC = $PMSRC/pm-mycomma1.rc     # Let it rip.
        :0 # count was predefined and prejudiced at -19
        * $ $EXCESS^0
        { EXITCODE=77 HOST }
      }

The one optimization that I can still see is to add the following condition to the recursion recipe:

      :0
      * EXCESS ?? ^[1-9]
      * HEADERLINES ?? $ $REGEXP(.*$)*\/$REGEXP(.*$)*
      {
         ...
      }

That'll avoid the recursion if we've already gone positive. Note that this is a Good Thing if just to avoid the copies that the very next condition will cause in copying the matched text back and forth from HEADERLINES to MATCH and back again.

HOWEVER...

With this added condition, it becomes more complicated if you want to get the total count, with no upper limit criteria as is true in the original goal. If you think it unlikely that you'll need to do that, or the extra bother of having to set EXCESS to negative 10 million doesn't bother you, go ahead and add the condition.

      # pm-mycomma1.rc -- Count commas; recursive includerc
      #
      #   Goal: count the number of comma-separated items in
      #   $HEADERLINES in lines that begin with $REGEXP. EXCESS will
      #   start at the negative of the maximum allowed, then count up
      #   towards zero. Clear MATCH, then count the items in the
      #   first match, if any.
      MATCH=
      :0
      * 1^0 HEADERLINES ?? $ $REGEXP\/.*
      * 1^1 MATCH ?? ,
      {
          # Okay, increment EXCESS by $=.
          :0
          * $ $EXCESS^0
          * $ $=^0
          { }
          EXCESS = $=
          #   Now, do we need to recurse? Only if there's still
          #   another match to $REGEXP. If so, we reset $HEADERLINES
          #   to contain only the second match and beyond, and then
          #   we take another ride.
          :0
          * HEADERLINES ?? $ $REGEXP(.*$)*\/$REGEXP(.*$)*
          {
             HEADERLINES = $MATCH
             INCLUDERC   = $_                      # Recurse!
          }
      }
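When testing the recursive includerc, it helps to have an independent count to compare against. The following awk sketch counts the commas in To:, Cc: and Apparently-To: fields, including their continuation lines. It is a simplification under assumptions: it ignores the Resent-* switch that the procmail version makes, and the function name count_commas is hypothetical.

```shell
# count_commas: read one message on stdin and print the number of commas
# found in To:, Cc: and Apparently-To: header fields.
count_commas() {
    awk '
        /^$/ { exit }                     # stop at the end of the header
        {
            first = substr($0, 1, 1)
            # A line that does not start with whitespace begins a new field;
            # continuation lines keep the decision made for their field.
            if (first != " " && first != "\t")
                want = (tolower($0) ~ /^(to|cc|apparently-to):/)
        }
        want { n += gsub(/,/, ",") }      # gsub returns the number of commas
        END { print n + 0 }
    '
}

# Possible use:
#   count_commas < saved-message
```

The result for a test message plus the initial EXCESS of -19 should equal the final EXCESS that the recipes compute.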

12.0 Formail usage [toc]

12.1 Always use formail's -rt switch [toc]

[faq] formail -r breaks RFC822, so always use formail -rt if you don't know what this means. Perhaps you should always use it anyway.

[david] There is formail -r[t] rank bar graph in the source code of 3.11pre4. It might be easier to follow as a top-to-bottom listing (and again, Tom Zeltwanger appears to be using one of the older versions where From_ was mistakenly overpromoted). These are the rankings in version 3.11pre4:

      formail -r:                     formail -rt:
      Resent-Reply-To:                Resent-Reply-To:
      Resent-Sender:                  Resent-From:
      Resent-From:                    Resent-Sender:
      Return-Receipt-To:              Reply-To:
      Errors-To:                      From:
      Reply-To:                       Sender:
      Sender:                         Return-Receipt-To:
      From_                           Errors-To:
      Return-Path:                    Return-Path:
      Path:                           From_
      From:                           Path:

[Stephane Bortzmeyer bortzmeyer@pasteur.fr] Always use 'formail -rt' and never 'formail -r', because such precedence (Sender over From) is an important violation of RFC 822. There is one canonical order, described in the RFC, and nothing else should be used, like fuzzy ranking or, worse, reordering. This is a serious problem with formail.

The proper order is:

      Reply-To, else From, else Sender, else <error>

And, how would you deal with resent mail?? Ie: Resent-Reply-To, Resent-From, and Resent-Sender?

It treats Resent-X as X (" Whenever the string "Resent-" begins a field name, the field has the same semantics as a field whose name does not have the prefix. "). So you have to choose an order between them, the RFC does not specify it.

[david] I think that the idea is that formail -r is intended to determine the origination address, not the place to reply; formail -rt is for determining the place to send replies. For addressing a response, yes, -rt will invert the header in a way more in line with the rules; for figuring out the origination point, formail -rzxTo: might be better than formail -rtzxTo:.

And here's an additional problem: formail -rD always uses the formail -r precedences; you can't make it use the -rt precedences and the -D cache-checking function at the same time.

4.4.4. AUTOMATIC USE OF FROM / SENDER / REPLY-TO (RFC 822 excerpt)

For systems which automatically generate address lists for replies to messages, the following recommendations are made:

Sometimes, a recipient may actually wish to communicate with the person that initiated the message transfer. In such cases, it is reasonable to use the "Sender" address.

This recommendation is intended only for automated use of originator-fields and is not intended to suggest that replies may not also be sent to other recipients of messages. It is up to the respective mail-handling programs to decide what additional facilities will be provided.

Examples are provided in Appendix A.
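The canonical order quoted above (Reply-To, else From, else Sender, else error) is simple enough to sketch in awk. This is an illustration of that order only, not a description of how formail works internally: it ignores Resent-* fields and continuation lines, and the function name reply_target is hypothetical.

```shell
# reply_target: read one message on stdin and print the value of
# Reply-To:, else From:, else Sender:. Exit status 1 if none is present.
reply_target() {
    awk '
        /^$/ { exit }                     # header ends at the first blank line
        {
            name = tolower($0)
            sub(/:.*/, "", name)          # lower-cased field name
            value = $0
            sub(/^[^:]*: */, "", value)   # text after the colon
        }
        name == "reply-to" && reply  == "" { reply  = value }
        name == "from"     && from   == "" { from   = value }
        name == "sender"   && sender == "" { sender = value }
        END {
            if (reply != "")       print reply
            else if (from != "")   print from
            else if (sender != "") print sender
            else exit 1
        }
    '
}

# Possible use:
#   reply_target < saved-message
```

The mbox "From " separator line carries no colon directly after the word, so it is never mistaken for the From: field.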

12.2 Using -rt and rewriting the From address [toc]

Sendmail adds the From header which points to your account. But in some cases you may wish to rewrite the From address.

You could also use Reply-To to signify where you want further responses to go, but that doesn't hide your true From address. And there are still MUAs that don't obey Reply-To. Whatever reason you have to rewrite the From header, here is the command.

      :0 fhw
      | $FORMAIL -rt -I "From: me@forever-lasting-address.com"

12.3 Formail -rt and Resent-From header [toc]

Here is something that made me scratch my head a lot. Let's first examine a scenario which explains how the email travels.
      account --> virtual-address --> Local-address

In this chain I was sending a message from my University account to my official work address; the virtual-address delivers the mail to the right local domain. There is only one problem with this picture. When I generated a response from Local-address with formail -rt, the generated address pointed back to virtual-address, which of course pointed back to Local-address. A loop was ready: the reply never got routed to the original address, account.

What was happening here was that the mailserver that handled the virtual-address, didn't forward the message, but instead remailed the message. In this process a set of new headers were generated:

      Resent-From: <virtual-address>
      X-From-Line: <account>
      Received: from <the virtual-address mailserver>
      Resent-Message-Id: 199710151903.WAA28670@virtual-address
      Resent-Date: <date>
      Resent-To: <local-address>
      Received: ...<account domain>
      Message-Id: 199710151904.WAA05050@account-domain
      From: <account-domain>

And now when the formail -rt command was used, it picked up the Resent-From address as the destination where the message should be returned. Surprising, but according to procmail, 100% correct. Resent-From has higher priority than From.

The Resent-* headers are considered informative, and should never be used when automatically generating a response. The problem here is the middleman: it should not resend a message, but rather forward it. So I put this into my .procmailrc to handle the broken middleman at our site.

      #   Remove that misleading Resent-From if it was added by our
      #   "middleman"
      #
      :0 fh
      * Resent-From: <our-domain>
      | $FORMAIL -IResent-From:

[edward] adds to this: As you know, formail -rt is for composing a response to the address from which an e-mail was sent. Let's say you are on vacation and have set up a procmail recipe to autorespond to all e-mail you receive. Furthermore, let's say Joe sends me an e-mail and I re-send it to you. If you wanted to respond to the sender of the e-mail that you received, would you e-mail me or Joe? You had better e-mail me, because I was the one who sent it to you. Joe may not even know you. Imagine if you did send your response to Joe. It would probably cause him considerable confusion as to why you are sending him e-mail informing him that you are on vacation.

formail -rt uses a heuristic algorithm to determine who it should respond to, based on the presence of various headers and their contents. If you look at the formail.c source code, you'll see a graphical representation of this algorithm. It will also explain the difference between the results of formail -r and formail -rt.

Resent-Reply-To has the highest relative importance/reliability of all header fields. Next is Resent-From and Resent-Sender, followed by Reply-To, From, Sender, et al.

12.4 Quoting the message [toc]

Use formail -rtk

12.5 Without quoting the message [toc]

Use formail -rkb or formail -rkt -p '' or formail -rkt -b

12.6 How to include headers and body to the reply message [toc]

[david] ...It does require that the entire head fit into sed's hold space, but it almost always will; exceptions are cases where the sender messed around and added a bunch of uninformative (and usually self-congratulatory) additional headers or when the message got caught in a loop for a while but finally escaped before being bounced for too many hops.
      :0 fhwr
      | sed -e H -e '$ G'
      :0 fhw
      | $FORMAIL -rt ... now generate reply ...

12.7 How to add text to the beginning of message [toc]

      :0 fh
      | cat - ; echo "This text comes at the beginning of the body"

12.8 How to truncate headers (save filing space) [toc]

[Idea by Rodger Anderson rodger@hpbs2245.boi.hp.com] As a last recipe, if you're tight on space, you could remove extraneous headers. But make sure you want to do that, because headers may contain useful information about URLs and other things like mail server addresses. Some people put the info into an X-header instead of their signature, so that it doesn't bother people who read the body text.
      #   Strip header to bare minimum
      #   If this is MIME multipart, then skip recipe
      :0 fhw
      * ! multipart
      |   $FORMAIL -k                                                  \
          -X Date:                                                    \
          -X Subject:                                                 \
          -X Message-Id:                                              \
          -X From                                                     \
          -X To:                                                      \
          -X Cc:                                                      \
          -X Reply-To:                                                \
          -X Mime-Version:                                            \
          -X Content-type:
      :0 :
      mail.default.mbox

[david] comments the final recipe

Another slightly different approach is to kill the headers that take up the most space. If you're not interested in tracking down the original sender of a possible UBE message, then you can remove the Received headers. You may want to fill out the condition line so that it applies only to your work or campus messages, and let other messages keep their full headers.

      :0  fhw
      *   possible-condition-to-handle-only-certain-messages
      |   $FORMAIL -I Received:

12.9 Adding extra headers from file [toc]

[stephen] Notice that the obvious solution won't do here:
      :0 fhw
      * condition
      | $FORMAIL -rt | cat - $HOME/newHeaders

The problem here is that there will be a newline in the middle, which causes the header to be shortened (procmail determines the new header/body boundary after having processed each filter). Use the following instead.

      :0 fhw
      * condition
      | $FORMAIL -rt -X "" ; cat $HOME/newHeaders ; echo

[david] If $HOME/newHeaders ends in a blank line, you don't need the "; echo". Under some circumstances procmail puts back the blank separating line if it gets lost, but I'm not sure exactly what those are, and you have a SHELLMETAS character in there already (the first semicolon), so a shell is forked anyway.

But this is my favourite way (it assumes that formail -r will never generate a continuation line for From:); if you use it, make sure that the newHeaders file does NOT contain a trailing blank line:

      :0 fhw
      * whatever
      | $FORMAIL -rtn
        :0 Afhw
        | sed "/^From:/r $HOME/newHeaders"

12.10 Extracting all From addresses from mailbox [toc]

The -ns causes formail to split the mailbox and feed each mail separately to next process.
      % cat mailbox | formail -ns formail -xFrom: | sort -u

12.11 Applying procmail recipe on whole mailbox [toc]

      % cat mailbox | formail -ns procmail pm-experiments.rc

12.12 Splitting digest [toc]

[Idea by David Hunt] One interesting idea for handling digests automatically as single messages is to call procmail recursively. First call formail to split the mail when header fields are contained in the body, with procmail as the output program of formail. The insertion of X-Loop makes it possible to reuse .procmailrc for the separate messages.
      #   If it looks like more than one mail, send to formail for
      #   splitting, then send back to procmail for sorting again.
      #
      :0 B
      *  ^From [-_+.@a-z0-9]+  (Sun|Mon|Tue|Wed|Thu|Fri|Sat)
      *  ^From:
      *  ^TO
      *$ ! H ?? ^X-Loop: $MY_XLOOP
      | $FORMAIL -A "X-Loop: $MY_XLOOP" -m4s procmail

12.13 Making formail to run series of commands for each mail [toc]

Maybe the heat has melted my brain, but I can't seem to get formail to perform a series of commands on each mail that it's split from a folder. Here's an example of a simple debugging attempt: I've tried parentheses, putting the commands into a shell function, and other flailings too numerous to remember, all to naught.
      % formail -s addr=`formail -XFrom: | formail -r | formail -zx To`;\
          echo "$addr" >>output

It appears that formail doesn't use the shell when executing the command specified when splitting. No SHELLMETAS here. Given that, the secret is to fire up the shell explicitly yourself to do the piping:

      % formail -s sh -c 'formail -XFrom: | formail -rzxTo:' >> output

Note that you only need two formails in the pipe, not three, as the -r flag works correctly when combined with other flags.

12.14 Option -D and cache [toc]

[Bob Weissman b_weissm@kla.com] and [stephen] These files are self-limiting. The number after the -D is the size in bytes above which the older entries will be removed. E.g., my .procmailrc has
      :0 Wh:  .msgid.lock
      |$FORMAIL -Y -D 12288 .msgid.cache

And the file never exceeds 12288 bytes by very much. Though formail may indeed exceed this size by as much as the length of one message-ID, the file size should never grow significantly beyond that, even if used indefinitely. The file is in binary format, with each entry terminated by a single null byte and an occasional double null (a significant placeholder).

12.15 Option -D and message-id in the body [toc]

Some of my messages contain the original Message-ID in the body of the letter and not in the header. Is there an option for formail to overcome this problem?

[david] This is strictly untested; I don't know where in the body the Message-ID's appear, but if they're at the top of the body, this might help:

      #   If there's a Message-Id: in the head, use that one.
      #
      :0 hW        # H is implicit; brackets enclose caret, space, tab
      * ^Message-Id:.*[^        ]
      | $FORMAIL -D cache_size cache_name
      :0E BbW      # If not, but there's one in the body, try the body.
      * ^Message-Id:.*[^        ]
      | $FORMAIL -D cache_size cache_name

You might want to copy a Message-Id: from the body to the head in any case (if there's none already in the head) just to have it in the right place, so we could do that first and then formail -D will work normally. This form will run formail twice if the Message-Id: header is in the body instead of the head, but it will look for Message-Id: on any line of the body, not just at the top:

      :0 fhw
      * ! H   ?? ^Message-Id:.*$NSPC
      *   B   ?? ^\/Message-Id:.*$NSPC
      | $FORMAIL -A "$MATCH"
      :0 hW
      | $FORMAIL -D cache_size cache_name

12.16 Reducing formail calls (conditionally adding fields) [toc]

Suppose you want to add fields to the message when some condition is met:
      :0 fhw          # compose initial reply
      | $FORMAIL -rt
      :0 fhw
      * condition1
      | $FORMAIL -A "X-Header1: value1"
      :0 fhw
      * condition2
      | $FORMAIL -A "X-Header2: value2"

Hm, we have three processes called here; can we minimize the calls? Yes, and this is an idea from [philip] and [david]. Notice that there is only ONE process needed.

      :0
      * condition1
      { hdr1 = "-AX-Header1: value" }
      :0
      * condition2
      { hdr2 = "-AX-Header2: value" }
      :0 fh
      | $FORMAIL -rt  ${hdr1+"$hdr1"} ${hdr2+"$hdr2"}

And if you want to stack all headers to only one variable, it is a bit of extra work. Below we use short variable names only because of the line space: the calls fit on one line.

The recipe says: if f has previous value, set nl to newline separator, later concat previous contents of f with possible newline and new header field.
      f       # kill variable
      :0
      { nl nl=${f+"$NL"}  f="$f${nl}X-Header1: value" }
      :0
      { nl nl=${f+"$NL"}  f="$f${nl}X-Header2: value" }
      #   If we have something in *f*, call formail
      :0 fh
      * ! f ?? ^^^^
      | $FORMAIL ${f+-A"$f"}

The above recipe was the most general one: each recipe determined by itself whether f existed previously or not. But if you know that f is already set, you can write a simpler recipe:

      # We know f has value before our module
      :0
      { f = "$f${NL}X-Header1: value" }

12.17 Formail -A -a options [toc]

You can't use option -A with -a or -I if the header name is the same. In the example below you try to keep only the last definition of X-1, but the first -A isn't seen when -a is applied.
      formail -A "X-1: 1" -a "X-1: 2"
      -->
      X-1: 1
      X-1: 2

Whereas separate pipes give you the desired results.

      formail -A "X-1: 1" | formail -a "X-1: 2"
      -->
      X-1: 1
      formail -A "X-1: 1" | formail -I "X-1: 2"
      -->
      X-1: 2

12.18 Formail -e -s options [toc]

[david] I had a file of alternating From: and Date: lines and wanted to convert it into an mbox.
      formail -dem2 -s < input > mailbox

should have done it, right? Nope; formail -s took it all as one message, even with -m1. When I edited in blank lines, the command worked.

My first reaction was that the -e option wasn't working as advertised and that the blank lines were necessary after all.

Then I realized the real problem: there was no interruption in the succession of valid header lines in the input for anything that could look like a body. I could have put something other than blank lines between each pair of headerfields and then -e would have done its job, but as long as every additional line looked like a valid RFC822 headerfield, even if its name was the same as one that had appeared earlier, formail -s assumed that it was still the same message's head.


13.0 Procmail, MIME and HTML [toc]

13.1 Software to deal with mime or html [toc]

See also your nearest Perl CPAN site (http://www.perl.org) and the module CPAN/modules/by-module/MIME/MIME-tools-4.112.tar.gz

There is also a Unix binary, munpack, that explodes a mime message into separate files.

Also see the mutt Unix e-mail agent, which can handle HTML mail. (Pointers to Mutt were mentioned previously.)

13.2 Killing html mime messages [toc]

[era] Here is a simple filter to throw out unwanted html that is sent by using mime.
      :0:
      *$ ^Content-Type:$s*multipart/(mixed|alternative);\
         $WSPCL*boundary="?\/[^;"]+
      * $ B ?? ^--$\MATCH\$([-a-z]+:.*)*Content-type:$s*text/html
      junk-html.mbox

Some more examples can be found from section: 'Explaning ^^ and ^'

13.3 Complaining about html messages [toc]

[Marek Jedlinski eristic@gryzmak.lodz.pdi.net] This is how I respond to html messages. In my no_HTML_please file I politely explain why I don't appreciate receiving HTML email, and ask the sender to resend the message as plaintext. What happens in the majority of cases is that the sender resends the same message again ("oh, it bounced, let's try again") and I assume they don't actually read my explanation since they just happily resend the HTML cr*p. It bounces again, at which point they give up... Tough luck, I say ;)

BTW, the recipe below is placed after mailing list mail gets sorted. When someone sends HTML mail to a mailing list I read, I just flame them in person.

      TXT_NO_HTML = $HOME/no_HTML_please.txt
      :0 BH
      *  ! ^FROM_DAEMON
      *$ ! ^X-Loop: $XLOOP
      *    ^Content.Type.+multipart.alternative
      *    ^Content.Type.+text.html
      {
              LOG = "$LF --TRASH: multi-part HTML $LF"
              :0
              | ($FORMAIL                                             \
                    -rtk                                              \
                    -A "X-Mailer: Procmail Autoreply"                 \
                    -A "X-Loop: $XLOOP" ;                             \
                  cat $TXT_NO_HTML                                    \
                  ) | $SENDMAIL -oi -t
      }

13.4 Getting rid of unwanted mime attachments (html, vcard) [toc]

Microsoft and Netscape MUAs are conquering the PC world and it's likely that you will receive messages from people that use this software. The unfortunate thing is that you receive the message in mime format:
      HEADERS
      --mime-boundary
      plain text
      --mime-boundary
      Some idiotic html (or other type) copy of the text
      --mime-boundary

Whereas you would like to see a traditional message in this format:

      HEADERS
      plain text

Good news: there is a procmail module that addresses this problem. The module can kill any mime attachment, and the predefined sets include typical cases.

The module is called pm-jamime-kill.rc and is included in Jari's pm-code.shar. See the procmail module section for how to get the shar file.

13.5 Sending contents of a html page in plain text to someone [toc]

[Timothy J Luoma luomat+procmail@luomat.peak.org] Send an email with the subject "GetPage: some.url.here/" and the page comes back. Kurt Thams thams@thams.com also pointed out that lynx allows the file:// protocol, and since procmail is running as you, a request like the following would have been a security risk:
      GetFile: ~user/.login

We make the script safe here by forcing "http://$MATCH" and not simply using "$MATCH"

      :0
      *   ^Subject:[  ]GetPage:()\/.*
      *$ ! ^X-Loop: $MY_XLOOP
      |   ($FORMAIL -rt                                                \
          -I "Precedence: junk"                                       \
          -I "Subject: Requested page: $MATCH"                        \
          -I "X-Loop: $MY_XLOOP" ;                                    \
          lynx -dump "http://$MATCH"                                  \
          )|                                                          \
          $SENDMAIL -oi -t

14.0 Simple recipe examples [toc]

14.1 Saving: MH folders -- numbered messages [toc]

Hm. This is explained in the procmail man pages, but not very well. There are just one or two places where the man page tells how to create individual files instead of catenating messages to a folder. Notice the funny /. at the end of the folder name.
      :0
      * condition
      dir-folder/.

[manual] When delivering to directories (or to MH folders) you don't need to use lockfiles to prevent several concurrently running procmail programs from messing up.

14.2 Saving: to monthly folders [toc]

      # Use any date method mentioned previously to define variables
      # YYYY YY MM DD
      #
      # Archive digests monthly
      #
      :0 c:
      * ^From:.*\/mailing-list-digest@some.net
      {
          # Get the "mailing-list-digest" string, do not use following
          #
          #       MBOX = `echo $MATCH | sed -e 's/@.*//' `
          #
          # Because we really don't need those extra shell processes.
          # Procmail can derive the word 10X more efficiently
          #
          :0
          * MATCH ?? ()\/[^@]+
          { MBOX = $MATCH }
          :0 :
          $YYYY-$MM-$MBOX
      }

14.3 Modifying: Filtering basics [toc]

Pay attention to the cat command position in each recipe.
      :0 fb
      | echo "This is a line of text _before_ the body"; \
         cat -
      :0 fb
      | cat - ; \
        echo "This is a line of text _after_ the body"
      :0 fb               # catenate a canned message before the body
      | cat msg.txt -
      :0 fbi              # replace the body with a canned message
      | cat msg.txt
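
The effect of the cat position can be checked in an ordinary shell without sending any mail. A minimal sketch, where the body text and the temporary file standing in for msg.txt are made up for the test:

```shell
body='original body line'

# echo first, then cat: the extra line ends up before the body
before=$(printf '%s\n' "$body" | { echo "text before"; cat -; })

# cat first, then echo: the extra line ends up after the body
after=$(printf '%s\n' "$body" | { cat -; echo "text after"; })

# cat FILE - : a canned file followed by the body read from stdin
msg=$(mktemp)
echo "canned message" > "$msg"
canned=$(printf '%s\n' "$body" | cat "$msg" -)
rm -f "$msg"

printf '%s\n---\n%s\n---\n%s\n' "$before" "$after" "$canned"
```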

14.4 Modifying: Squeezing empty lines around message body [toc]

[david] Anything that replaces the body is going to require an outside process, even if it's only /bin/echo. In order to trim empty lines from the beginning and from the end of the message, you can do this if the entire body fits into LINEBUF:
      :0 fbBw
      * ^^$*\/.(.|$)*.$
      | echo "$MATCH" # trailing extra newline intended

If your version of cat is BSD-ish,

      # SysV's cat has a different meaning for -s and cannot do this
      :0 fbBw
      * $$$
      | cat -s

otherwise, it can be done with a very simple sed filter:

      :0 fbBw
      * ^^($)|$$$
      | sed /./,/^$/!d

Note that cat -s has slightly different results from the others: if there are any empty lines at the top of the body, cat -s will keep one. The echo and sed suggestions will remove all empty lines from the top and, like cat -s, keep one at the bottom.
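
The sed variant is easy to try outside procmail. The sketch below feeds it a made-up body with surplus empty lines; the function name squeeze is only for this illustration.

```shell
# Delete every line that is NOT inside a range running from a
# non-empty line to the next empty line.
squeeze() {
    sed '/./,/^$/!d'
}

# Two empty lines on top, two in the middle, two at the bottom.
sample() {
    printf '\n\nfirst\n\n\nsecond\n\n\n'
}

sample | squeeze
line_count=$(sample | squeeze | wc -l | tr -d ' ')
first_line=$(sample | squeeze | head -n 1)
```

Four lines survive: "first", one empty line, "second" and one empty line at the bottom.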

14.5 Service: Auto answerer to empty messages [toc]

[elijah] Here is a piece of code that responds to empty messages.
      :0 B
      * ! ...
      | (echo "From: me@here.com" ;                                   \
        $FORMAIL -r -A"Precedence: junk"                               \
        -A"X-Loop: me@here.com" ;                                     \
        echo "Your blank message was received." ;                     \
        echo "Did you mean to say something?" ;                       \
        echo "" ;                                                     \
        echo "My Signature" ;                                         \
        echo "------" ;                                               \
        echo "this has been an automated response") | $SENDMAIL -t

14.6 Service: Ping responder [toc]

Sometimes I'm on the road and I don't seem to get access to the site where my messages are. The telnet connection fails and standard unix "ping" plays dead for me. "What's happening in that site?" I wonder. Here is a recipe that I have added to all of my accounts. It sends an immediate reply if at least the mailhost is up and gives some status information
      :0
      * ^Subject: ping$
      {
          :0 fh
          | $FORMAIL -rt
          #   Remember, Don't send back anything that would be vital to
          #   attacker. It doesn't matter if the `uptime` or other
          #   scripts fail, the reply is sent anyway.
          :0 c    # Record this ping request
          |   ( cat -;                                                \
                echo `uptime`;                                        \
                echo "$HOST User count: " `who | wc -l`;              \
              ) | $SMAIL
          :0:   # or sink to $DEFAULT
          $PING_SPOOL
      }

14.7 Service: simple vacation with procmail [toc]

Don't forget to look into the procmailex(5) man page, which also has a vacation example. The ones presented below may not work for you. Here is a very simple vacation recipe. Whenever the file ~/.vac exists, the vacation program is called. Be sure that you have the ~/.vacation.msg file ready too. Remember that vacation does not save your messages, so we need the c flag here.
      :0 wc
      * ? test -e $HOME/.vac
      | vacation $LOGNAME

Some people like to raise a flag in .procmailrc instead of creating a file. If you like the variable approach better, here is the equivalent implementation of the above

      VACATION = "yes"    # Comment this out when not on vacation
      :0 wc
      * VACATION ?? yes
      | vacation $LOGNAME

[phil] and [era] Since vacation only sends replies (it never saves the original messages), here is one way to do both things from your .forward file. Substitute "abc" with your login name.

      "|/usr/ucb/vacation","|exec /usr/local/bin/procmail -f- || exit 75 #abc"

14.8 Service: vacation code example [toc]

[By Eric Black eric@Mirador.COM] Here is the procmail part
      OFFSITE = "my_guest_login@wherever.I.am"
      #  Forward urgent mail to me at my offsite address; afterward,
      #  continue processing it as normal. The procmail pattern match
      #  may be case-insensitive, in which case this rule could be
      #  simplified...
      #
      :0 c
      * ^Subject: .*urgent
      | $SENDMAIL $OFFSITE

      #  Use "vacation" to tell other people I'm not here. To enable,
      #  un-comment the next two lines; to disable, comment them out
      #
      #  The -a Identifies another name that can legitimately
      #         appear in the To: line of the mail header instead
      #         of your login name
      #
      :0 wc
      | vacation -a ericb eric

And here is the ~/.vacation.msg file

      Subject: I'm out of town for a while
      From: eric (via the vacation program)

      I'm out of town until <return-date>.  Your mail regarding
             "$SUBJECT"
      will be read when I return, or possibly at some unknown
      time before then if I get a chance to check for email.
      If your message must be seen by me before I return,
      you can send it with the word "URGENT" in the subject header.
      Such mail will be automatically forwarded to me so that
      I see it sooner.
      --Eric

14.9 Service: Auto-forwarding [toc]

I have my .procmailrc setup to forward email to another (email only) account. When I am not going to be at the account, I want to turn forwarding off

[Timothy J Luoma luomat@peak.org]

      #   look for the file to tell us whether or not to forward mail
      #   if the file exists, forward the mail
      #   or not
      :0 c
      * ? test -r $HOME/.forwardmail
      ! me@elsewhere.com
      #   if a message arrives from the other account
      #   with the Subject 'forward-off' then remove the
      #   file, effectively turning off forwarding
      :0 hi
      * ^From:.*me@elsewhere\.com
      * ^Subject: forward-off
      |/bin/rm -f $HOME/.forwardmail
      #   if a message arrives from the other account
      #   with the Subject 'forward-on' then create the
      #   file, effectively turning forwarding on
      :0 hi
      * ^From:.*me@elsewhere\.com
      * ^Subject: forward-on
      | touch $HOME/.forwardmail

14.10 Service: forward only specific messages [toc]

Here is a piece of code that triggers forwarding according to addresses. If you have a lot of this kind of forwarding, you should use a simple awk database which you would grep.
      #   By Jim Hribnak hribnak@nucleus.com
      #   info@domain1.com goes to joe@domain1.com
      #   info@domain2.com goes to fred@domain2.com
      #
      :0
      * ^TO_info@domain1.com\>
              {  FORWARDTO="$FORWARDTO joe@domain1.com"  }
      :0
      * ^TO_info@domain2.com\>
              {  FORWARDTO="$FORWARDTO fred@domain2.com"  }
      :0 fw
      * FORWARDTO ?? @
      * ! X-Loop:.*your@address.here
      | $FORMAIL -A "X-Loop: your@address.here"
        :0 a
        ! $FORWARDTO
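
One possible shape for such a database, as a rough sketch: a plain text file with the incoming address in the first column and the forwarding target in the second, queried with awk. The file layout, the temporary file and the function name lookup are my assumptions, not part of Jim's recipe.

```shell
# Hypothetical lookup table: "incoming-address forwarding-target"
table=$(mktemp)
cat > "$table" <<'EOF'
info@domain1.com joe@domain1.com
info@domain2.com fred@domain2.com
EOF

# Print the forwarding target for an address, comparing
# case-insensitively; print nothing if the address is unknown.
lookup() {
    awk -v key="$1" 'tolower($1) == tolower(key) { print $2 }' "$table"
}

known=$(lookup INFO@domain1.com)
unknown=$(lookup nobody@domain3.com)
echo "known=$known unknown=$unknown"
rm -f "$table"
```

From procmail, the output of such a lookup could be captured with backticks into FORWARDTO, so that one recipe serves any number of addresses.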

14.11 Service: Making digests [toc]

      # By jimo@eskimo.com
      # Add this message to the digest accumulator
      :0 c:
      | $FORMAIL -k -X From: -X Message-Id -X Date -X Subject >> $DIGEST
      #Check size of digest, and send it off if it's big enough
      :0
      * $       -$DIGSIZE     ^0
      * $ `wc -l <$DIGEST`    ^0
      | nice -10 send-digest $DIGEST

14.12 Kill: simple killfile recipe with procmail [toc]

Killfiles are widely used with newsreaders to delete uninteresting posts when you enter a newsgroup. A killfile usually contains one single entry per line to match the message content and this can be easily done with procmail. Remember however that for every message procmail forks a process, so before you apply the killfile rules to the messages, be sure your recipes are in this order: the killfile rules are applied only to unknown messages
      SINK MAILING-LISTS
      SINK ANNOUNCEMENTS
      SINK WORK MESSAGES
      OTHER DELIVERIES
      apply killfile rules and UBE recipes to the rest

Recipe will drop the message (i.e. consider it 'delivered') if one of its headers matches a pattern in killfile.

      :0 hW:  $HOME/.killfile.lock
      | egrep -i -f $HOME/.killfile

The reason why there is explicit lockfile is that you must be able to update the killfile while your procmail is running. An example edit script is presented below.

      #!/bin/sh
      # program: killfile.sh
      #
      file=$HOME/.killfile
      lock=$file.lock
      cp $file $file.tmp
      emacs -q $file.tmp      # or use whatever you prefer: vi, pico
      lockfile $lock
      mv $file.tmp $file
      rm -f $lock
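
Before relying on a killfile, the egrep call from the recipe can be tried by hand on a few header lines. This is a sketch under assumptions: the patterns and headers are invented, a temporary file stands in for $HOME/.killfile, and grep -E is used as the POSIX spelling of egrep.

```shell
killfile=$(mktemp)
printf '%s\n' '^From:.*spammer@example\.com' \
              '^Subject:.*make money fast' > "$killfile"

# Exit status 0 means some pattern matched; in the recipe that is
# what makes procmail consider the message delivered (dropped).
is_killed() {
    grep -E -i -q -f "$killfile"
}

r1=$(printf 'From: Spammer@Example.com\nSubject: hi\n' | is_killed \
     && echo killed || echo kept)
r2=$(printf 'From: friend@example.org\nSubject: lunch\n' | is_killed \
     && echo killed || echo kept)
echo "$r1 $r2"
rm -f "$killfile"
```

Watch out for empty lines in the killfile: an empty pattern matches every line, so every message would be dropped.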

14.13 Kill: duplicate messages [toc]

[Lars Kellogg-Stedman lars@bu.edu] Put this as the first entry in your .procmailrc and you won't see any duplicates as long as the 8K cache doesn't get full. The duplicates folder is cleaned out weekly via a cron job. While it may be tempting to simply sink duplicates to /dev/null, I have come across broken mail clients that stick the same value in the Message-id header of all outgoing mail.
      # IF  the message has a message-id header
      # AND formail -D is successful (exit status=0)
      # THEN
      #   log a message to the procmail log
      #   sink the message
      SUBJECT         = ${SUBJECT:-`$FORMAIL -xSubject:`}
      MID_CACHE_LEN   = 8192
      MID_CACHE_FILE  = $PMSRC/msgid.cache
      MID_CACHE_LOCK  = $PMSRC/msgid.cache.lock
      LOCKFILE = $MID_CACHE_LOCK
      :0
      *  ^message-id:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
      {
          LOG="dupecheck: discarded message, $SUBJECT $NL"
          :0     # Store duplicates here or to /dev/null
          duplicate.mbox
      }
      LOCKFILE

And here is a somewhat simpler recipe, a slightly modified version of the one in the [manual]. Procmail notices formail's success and considers the copy delivered, but because of the c flag it does not stop processing the rcfile, which lets the message fall into a safety copy folder.

      :0 hWc: $PMSRC/pm-msgid.cache.lock
      *  ^Message-id:
      | $FORMAIL -D 8192 $PMSRC/pm-msgid.cache
        :0 a:
        duplicate.mbox

14.14 Kill: spam filter with simple recipes [toc]

[Ed McGuire emcguire@i2.com] Seeing several junk mail filters posted recently, varying from the simple to the complex, I thought I would also share my own. I junk whatever comes from my ISP but is not addressed to my domain or to one of the mailing lists I subscribe to.
      #   1.  mail that arrived through my ISP
      #   2.  NOT addressed to my domain
      #   3.  NOT addressed to mailing lists I'm subscribed to.
      #
      :0:
      * ^(received):.*psi\.com
      * ! ^((apparently-)?to|cc):.*(i2|intellection)\.com
      * ! ^(to|cc):.*(pdp-?8-lovers|procmail|sunshine|info-pdp11)
      junk-ube.mbox

[Gordon Matzigkeit gord@m-tech.ab.ca] I have just discovered an effective rule for separating SPAM from the rest of my e-mail. Just substitute your username for gord in the line below

      # Anything which is not addressed to me is probably SPAM.
      :0:
      * !^TO.*\<gord\>
      junk-ube.mbox

This only works because I handle all mailing list addresses above that point in my .procmailrc (i.e. all traffic that arrives from mailing lists that I am subscribed to goes into other folders). Most SPAMmers seem to do it nowadays by sending mail via mailing lists, rather than creating huge To lists of users

Sysadmins often install a list of known addresses that send spam and then check the incoming mail against this "black list". Keep in mind that some fgrep implementations have a problem with the -w word switch. Note that the recipe below scans the FULL HEADER, so use it with some caution, i.e., be careful what you add to your list of spam domains.

      # by [phil]; egrep would do here too: if it is posix
      # compliant, it may have the -F switch that makes it behave
      # like fgrep.
      #
      # Note: option -F would make [ef]grep to search fixed string
      #       instead of regexps.
      #
      BLOCK_FILE  = $HOME/Mail/DeniedNames.lst
      UBE_MBOX    = $HOME/Mail/junk-ube.mbox

      # To filter out the Subject lines, so that emails sent
      # with the subject "Have you received a message from
      # blah-blah@spam" don't get filtered.
      # [era] suggested we use formail
      #
      # Edsel Adap edsel.adap@Canada.Sun.COM agrees there is a
      # likely bug in Solaris 2.5.1 "/usr/bin/fgrep -i" and
      # suggested the use of /usr/xpg4/bin/fgrep instead.
      #
      # edsel.adap@canada.sun.com Sun Microsystems Developer Support
      # Files in /usr/xpg4 are available via the SUNWxcu4 package,
      # which is part of the user, developer, all, or Xall Solaris
      # clusters.
      #
      # Solaris 2.4 doesn't have /usr/xpg4/bin/fgrep :-(, you
      # must use  `tr A-Z a-z' before piping the message to fgrep.
      #
      :0 hw:
      *$ ? $FORMAIL -ISubject: |fgrep -i -f $BLOCK_FILE
      $UBE_MBOX

The file DeniedNames.lst is simply a list of addresses

      82338201@compuserve.com
      Dwnliner@ix.netcom.com
      Emerald@earthstar.com
      FreeWay@dm1.com
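
The matching logic can be tried without procmail or formail. In this sketch a sed one-liner stands in for formail -ISubject: (an assumption: unlike formail it does not handle continuation lines), grep -F replaces fgrep, and the block file is a temporary copy of two entries from the list above.

```shell
block_file=$(mktemp)
printf '%s\n' 'Dwnliner@ix.netcom.com' 'FreeWay@dm1.com' > "$block_file"

# Drop the Subject line, then look for any listed address as a
# fixed string, ignoring case. Exit status 0 means "blocked".
is_blocked() {
    sed '/^Subject:/d' | grep -F -i -q -f "$block_file"
}

spam=$(printf 'From: dwnliner@ix.netcom.com\nSubject: hello\n' \
       | is_blocked && echo yes || echo no)
ham=$(printf 'From: friend@example.org\nSubject: mail from FreeWay@dm1.com?\n' \
      | is_blocked && echo yes || echo no)
echo "spam=$spam ham=$ham"
rm -f "$block_file"
```

The second message mentions a listed address only in its Subject, so it is not blocked.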

14.15 Kill: (un)subscribe messages [toc]

I'm getting tired of those pesky (un)subscribe messages that certain "other" mailing lists seem to pass through to the list at large instead of capturing them at the list server, like SmartList does.

[Adam Shostack adam@bwh.harvard.edu] The following do help, although they're often too broad. (I use a .safe rule to cover those cases.) The < 1000 is a useful heuristic. It's rare that unsubscribe messages are long.

      :0
      * (Delete|u*n*Sub(s| )*| add | leave | help )
      * < 1000
      /dev/null

[Rodger Anderson rodger@hpbs2245.boi.hp.com] I've been working on a recipe to filter out those pesky s*bscribe and uns*bscribe messages from mailing lists, and I'm posting what I have so far. As an aside, it also filters out very short messages, which I've found are usually some sort of message meant for the list owner/request address.

I give heavy weight to Subjects starting with (un)?s*bscribe, with also pretty heavy weight to Subjects containing either of those words. I then give heavy weight to the body of messages starting with those words, and a lighter weight to lines starting with them. Then multiple occurrences get some weight too, up to a point. Then I count the words in the message against all that.

      :0 B
      *  1^0
      *  30^0 H ?? ^Subject: +(un)?subscribe\>
      *  20^0 H ?? ^Subject:.*\<(un)?subscribe\>
      *$ 20^0   ^^$SPCL*(un)?subscribe\>
      *$ 10^0    ^$SPC*(un)?subscribe\>
      *  8^.4   \\<(un)?subscribe\>
      * -.4^1  \\<$a+\>
      junk.mbox
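
To get a feeling for the weights used above: per procmailsc(5), a weight written w^x contributes w for the first match, w*x for the second, w*x*x for the third and so on. The awk helper below just adds those terms up. The function name score is made up for this illustration, and it only mimics the arithmetic, not procmail's matching.

```shell
# score W X N: total contribution of N matches under weight W^X
score() {
    awk -v w="$1" -v x="$2" -v n="$3" 'BEGIN {
        total = 0; term = w
        for (i = 0; i < n; i++) { total += term; term *= x }
        printf "%.2f\n", total
    }'
}

one=$(score 8 .4 1)       # a single "subscribe" in the body
three=$(score 8 .4 3)     # three occurrences: 8 + 3.2 + 1.28
words=$(score -.4 1 50)   # fifty words, each costing .4
echo "$one $three $words"
```

So repeated occurrences under 8^.4 can never add more than 8/(1-.4), about 13.3 points, whereas every word counted by -.4^1 costs a full .4.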

[Adam Shostack adam@bwh.harvard.edu] How about looking for sub & unsub, as well as a perennial misspelling 'unsuscribe me'? I also find filtering on add, leave and help to be useful. This may well be the only word on the line. I think it has to do with broken list management packages.

      :0 B
      *  1^0
      * 30^0 H ?? ^Subject: +(un)?subscribe\>
      #   The b is often missing, as is the word fragment 'scribe'
      * 20^0 H ?? ^Subject: +(un)?sub?(scribe)?\>
      * 20^0 H ?? ^Subject:.*\<(un)?subscribe\>
      * 20^0 H ?? ^Subject: +(add|leave|help)$
      #   fewer points if more words
      * 15^0 H ?? ^Subject: +(add|leave|help)

14.16 Time: Once a day cron-like job [toc]

[Bill Moseley moseley@netcom.com] If you want to do something only once a day, then you have to store the date somewhere and check against that stored date.
      YYMMDD_FILE = $HOME/.yymmdd
      YYMMDD      = $YY-$MM-$DD
      YYMMDD_PREV = `cat $YYMMDD_FILE`
      #   If different date, then enter this block
      #
      :0 wc:  $YYMMDD_FILE.lock
      *$ YYMMDD ?? ! ^^$\YYMMDD_PREV^^
      {
          #   Update timestamp
          dummy = `echo $YYMMDD > $YYMMDD_FILE`
          ...do the cron jobs..
      }
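
The same stamp-file logic, written out as a plain shell sketch so it can be tested without procmail. The temporary file stands in for $HOME/.yymmdd, and the function name and the dates are invented for the example.

```shell
stamp_file=$(mktemp)     # stands in for $HOME/.yymmdd

# Run the daily job only if the stored date differs from today's.
run_once_a_day() {
    today=$1
    prev=$(cat "$stamp_file" 2>/dev/null)
    if [ "$today" != "$prev" ]; then
        echo "$today" > "$stamp_file"    # update timestamp
        echo "ran"                       # ...the cron jobs go here...
    else
        echo "skipped"
    fi
}

first=$(run_once_a_day 98-03-10)     # first mail of the day
second=$(run_once_a_day 98-03-10)    # later mail, same day
third=$(run_once_a_day 98-03-11)     # first mail of the next day
echo "$first $second $third"
rm -f "$stamp_file"
```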

14.17 Time: Running a recipe at a given time [toc]

If I put a program to my recipes, it will be executed every time message arrives. That's a problem, and I'm not allowed to use cron in this account. I'm looking for some sort of condition to check the current time and if its outside of the hours 11pm and 7am then execute the action.

[david] How do your From_ lines look? If they're the traditional kind that sendmail and smail add, they include the local time on your system at receipt. So include a check that the hour is between 07 and 22 inclusive, like this:

      :0 c
      *  ^From .*some-address.* (0[789]|1.|2[012]):[0-5][0-9]:
      |  command

I included the minutes and the colon that separates the minutes from the seconds so that the expression for testing the 07-22 range can match only on the hour.

14.18 Time: Triggering email and using cron [toc]

[david] Put something like the following entries in your personal crontab for your userid (not knowing whether your particular cron "cd's" to your home directory first):
      0 23 * * *  touch $HOME/.mail.relay.on
      0 7 * * *   rm -f $HOME/.mail.relay.on

And if your cron doesn't know the HOME variable (that'd be an exception)

      0 23 * * *  /bin/csh -c 'touch ~LOGNAME/.mail.relay.on'
      0 7 * * *   /bin/csh -c 'rm -f ~LOGNAME/.mail.relay.on'

Then, in your .procmailrc do:

      :0 c
      *  ^From.*some-address
      *  ? test -f $HOME/.mail.relay.on
      | command

The recipe will run the command only if both the From_ address matches and the file test succeeds. The file test will succeed only between 11pm and 7am.

In all honesty, if your system gives usable From_ lines, I like the following suggestion better. I use it all the time to turn blocks of procmail code on and off at given times or dates, and it works like a charm. It uses far fewer processes and is less likely to get the status wrong if for any reason one of the cron jobs fails to run or doesn't do its job.

This pages only during the daytime

      :0 c
      * ^From .*some-address.* (0[789]|1.|2[012]):[0-5][0-9]:
      | command

This pages at night

      :0 c
      * ^From .*some-address.* (0[0-6]|23):[0-5][0-9]:
      | command
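
To check hour patterns like these before trusting them in a recipe, you can try them against sample From_ lines with grep -E. A minimal sketch follows; the function name from_line_period is made up, and grep's regexp dialect is only close to procmail's, not identical.

```shell
#!/bin/sh
# Classify a From_ line by the hour it carries (sketch; the function
# name is hypothetical). Prints "day" for hours 07-22, "night" for
# hours 23-06 and "unknown" if no time is found.
from_line_period() {
    if printf '%s\n' "$1" |
        grep -E '^From .* (0[789]|1.|2[012]):[0-5][0-9]:' >/dev/null
    then
        echo day
    elif printf '%s\n' "$1" |
        grep -E '^From .* (0[0-6]|23):[0-5][0-9]:' >/dev/null
    then
        echo night
    else
        echo unknown
    fi
}
```

For example, from_line_period "From foo@bar.com Tue Mar 10 08:29:52 1998" falls into the daytime branch.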

14.19 Archiving: according to TO [toc]

The following code will save the message to folders list.foo, list.bar, list.procmail when the list name is in the TO address. Below you see the standard version:
      :0:
      * ^TOprocmail
      list.procmail
      [and so on...]

Here is a generalised version

      #   By dattier@wwa.com (David W. Tamkin)
      #   cases desired for foldernames
      #
      LISTS = "(foo|bar|procmail)"
      :0:
      * $ ^TO_\/$LISTS
      * $ LISTS ?? ()\/$\MATCH
      list.$MATCH

14.20 Archiving: Detecting mailing list mail [toc]

[phil] For most mailing lists, a more accurate way to determine whether a message came from the list is to examine the Return-Path:, From_ or Resent-From: header. This catches messages from the list, regardless of whether they were To: the list, Cc: the list, or even Bcc: the list, something which doesn't show in the message at all.

For instance, I refile messages from the procmail mailing list using the following recipe:

      :0
      * ^Return-Path: +<procmail-request@informatik
      ~/Lists/procmail/.

There's one tricky thing to note: if someone sends a message to both me and the list (say, he or she is responding to a message I sent to the list), then the copy that got to me through the list will end up in my procmail folder, while the copy that went directly won't. I like this behaviour, but some people, possibly yourself, may prefer it if both messages end up refiled. If so, your best bet is to combine the above with matching against the To: and Cc: headers via the ^TO_ token:

      :0
      * ^Return-Path: +<procmail-request@informatik|\
      ^TO_procmail@informatik
      ~/Lists/procmail/.

(If you have a version of procmail older than 3.11pre4, you'll need to use "^TOprocmail" instead of "^TO_procmail".) If you're subscribed to many mailing lists, a more general recipe helps.

Notice: you don't want to include \< in the recipe, as in ^TO_\<\/$LISTS, because the ^TO_ token already contains something similar to \< but better, so the \< can only cause problems. A trailing \> is not a bad idea, but because it's not a zero-width assertion but rather an actual character class, you have to strip the character it matched from the match:

      LISTS  = "(foo-list|bar-list)"
      #   1) to get the match
      #   2) rematch sans the trailing \>
      #   3) Note: preserves capitalisation of the string
      #
      :0
      * $ ^TO_\/$LISTS\>
      * $ MATCH ?? \/$LISTS
      * $ LISTS ?? ()\/$\MATCH
      {
          M = $MATCH
          <action>
      }

[Era] gives this example to describe what happens above:
      VAR =  "MOO"
      what = "moo|bar|baz"
      #   Search what from VAR
      :0
      * $ VAR ?? ()\/$what
      {
          #  Now find out which alternative really matched; there were
          #  several choices: moo, bar, baz
          #  Beware: $MATCH must not contain regexp characters
          #
          :0
          * $ what ?? ()\/$MATCH
          { dummy }
          # Fine, new MATCH contains moo
      }
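
The point of the second match is to get the name as it is spelled in your own list rather than as it was spelled in the mail. A rough shell sketch of the same idea follows. It is a simplification: it compares whole strings instead of doing procmail's substring regexp match, and the function name canonical_name is made up.

```shell
#!/bin/sh
# Hypothetical helper: given a string found in a message and a
# "|"-separated list of wanted names, print the name as it is spelled
# in the list, ignoring case. Prints nothing if there is no match.
canonical_name() {
    found=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # Split the list on "|" and compare each entry case-insensitively.
    printf '%s\n' "$2" | tr '|' '\n' | while IFS= read -r name; do
        lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$lower" = "$found" ]; then
            printf '%s\n' "$name"
            break
        fi
    done
}
```

Called as canonical_name MOO 'moo|bar|baz', it hands back the list's own spelling of the name.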

[Peter S Galbraith galbraith@mixing.qc.dfo.ca] I have used this in the past (by simply looking at the spool file and seeing the From_ line of the message):

      :0:
      * ^From debian
      list.debian.mbox
      :0:
      * ^From procmail
      list.procmail.mbox

Now, I collect specific high-volume mailing lists (like Debian) into their own spool files like above, and let other recipes catch all other mailing lists (like procmail and fvwm) into a single spool file with later rules:

      #   Majordomo lists
      :0:
      * ^Sender: owner-\/[-a-zA-Z0-9_.]*
      list.rest.mbox
      #   SmartList lists
      :0:
      * ^X-Mailing-List: <\/[-a-zA-Z0-9_.]*
      list.rest.mbox

So Debian mailing list mail goes to list.debian.mbox, procmail and fvwm mail goes to list.rest.mbox, and mail addressed to me yet CC'ed to a list goes to my main spool file.

14.21 Decoding: Uudecode [toc]

[phil] Here is a piece of code that hands a uuencoded body to a decoder when certain conditions match. The magic string here is "begin ...file"; the body is then fed to my_uudecode_program, whatever that does with it.
      :0 b
      *      ^From:.*someone@somewhere\.com
      *      ^Subject: Subject
      * B ?? ^begin 644 file.tar.gz
      | my_uudecode_program

14.22 Decoding: MIME [toc]

      #   by Peter Galbraith galbraith@mixing.qc.dfo.ca
      #   MIME filtering of accented characters and split lines.
      #
      :0
      * ^Content-Type: *text/plain
      {
        :0 fbw
        * ^Content-Transfer-Encoding: *quoted-printable
        | mimencode -u -q
          :0 Afhw
          | $FORMAIL -I "Content-Transfer-Encoding: 8bit"
        :0 fbw
        * ^Content-Transfer-Encoding: *base64
        | mimencode -u -b
          :0 Afhw
          | $FORMAIL -I "Content-Transfer-Encoding: 8bit"
      }

      #   1995-10-18 Tim Pickett tbp@cs.monash.edu.au
      #
      #       Decode MIME quoted-printable Content-Transfer-Encoding
      #
      #   Conditions
      #
      #       Mail has a MIME-Version header with a number in it.
      #       Header saying "Content-Transfer-Encoding: quoted-printable"
      #       exists
      :0
      *$ ^MIME-Version:$s*$d*(\.$d*)
      *$ ^Content-Transfer-Encoding:$s*quoted-printable
      {
        :0 fhw     # Remove header
        | $FORMAIL -I"Content-Transfer-Encoding:"
        :0 fbw             # Decode the body.
        | mmencode -u -q
      }

14.23 How to send commands in the message's body [toc]

      :0 b
      * ^Subject: ARCHIVE
      | sed -e '/$s*[^a-zA-Z]/,$ d' | sh

14.24 Matching two words on a line, but not one [toc]

How does one write a recipe that will do this: put mail in a mailbox if it has a line with two strings (one and two) like:
          one     two

but save mail in error-folder if the line has only the first string, like: one (string two is missing)

[phil] I presume these lines would be located in the body of the message, and that by "space between one and two" you mean "whitespace between one and two". If those assumptions are wrong then you'll need to tweak the following recipes:

      # The 'B' tells procmail to look in the body instead of the header.
      # The second colon tells procmail to lock the mailbox with a
      # locallockfile -- if mailbox is a directory then you don't need
      # it. $s is expected to hold a bracket expression containing a
      # space and a tab.
      #
      :0 B:
      *$ one$s*two
      default.mbox
      :0 B:
      * one
      error.mbox

Now, the above will match even if "one" or "two" is part of another word (at the end in the case of "one" and at the beginning in the case of "two"). If you don't want that then you'll need to change the recipes to read:

      :0 B:
      *$ ()\<one$s*two\>
      default.mbox
      :0 B:
      * ()\<one\>
      error.mbox
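
If you want to see how the word-boundary version sorts sample bodies before putting it in your rc file, a shell approximation can help. This is only a sketch: classify_body is a made-up name, grep's dialect differs from procmail's, and the explicit character classes below stand in for \< and \>.

```shell
#!/bin/sh
# Sketch: decide where a message body would go. Reads the body on
# stdin and prints "default", "error" or "none". -i mimics procmail's
# case-insensitive matching.
classify_body() {
    body=$(cat)
    if printf '%s\n' "$body" |
        grep -i -E '(^|[^[:alnum:]_])one[[:blank:]]*two([^[:alnum:]_]|$)' >/dev/null
    then
        echo default
    elif printf '%s\n' "$body" |
        grep -i -E '(^|[^[:alnum:]_])one([^[:alnum:]_]|$)' >/dev/null
    then
        echo error
    else
        echo none
    fi
}
```

Feeding it a body such as "someone   twofold" shows that neither recipe would fire, because both words are part of longer words.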

14.25 How to define personal XX macros? [toc]

By macro, I'm referring to procmail's FROM_DAEMON, TO and TO_ tokens that you can use in matches. Here is one way to make your own macro:
      #   By [alan] Define HEADERS to include those headers you care
      #   about. Pick one of the definitions below (and remove or
      #   comment out the others).
      #
      #   1. use only To:
      #   2. use either To: or Cc:
      #   3. To:, Cc:, or Apparently-To:
      #
      HEADERS='^To:(.*\<)?'
      HEADERS='^(To|Cc):(.*\<)?'
      HEADERS='^((Apparently-)?To|Cc):(.*\<)?'
      # Save mail whose chosen headers mention the address
      :0:
      *$ ${HEADERS}address@match.it
      address.mbx

14.26 How to change subject by body match [toc]

Suppose you want to change the email's subject when there is a match in the body. The desired outcome would be this:
      From: foo@this.is
      Subject: Fault: NNNN in program block YYY    << changed
      Fault: NNNN in program block YYY

Here is the answer

      :0 fhw
      *       ^Subject: NOK case report
      *$ B ?? ^$s*\/Fault: [0-9a-f]+ in program block.*
      | $FORMAIL -I "Subject: $MATCH"
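
To see what the body condition would capture, here is a small shell sketch that pulls the same line out of a message with sed. The function name fault_subject is hypothetical, and unlike procmail the sed expression is case-sensitive.

```shell
#!/bin/sh
# Sketch: read a whole message on stdin and print the replacement
# Subject: line taken from the first "Fault:" line in the body.
# Prints nothing when the body has no such line.
fault_subject() {
    sed -e '1,/^$/d' |
    sed -n -e 's/^[[:blank:]]*\(Fault: [0-9a-f][0-9a-f]* in program block.*\)$/Subject: \1/p' |
    head -n 1
}
```

The first sed drops everything up to the empty line that ends the header, so a "Fault:" string in a header field is not picked up by accident.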

14.27 How to change Subject according to some other header [toc]

Suppose you want to change the subject when mail comes to some particular address, or when some other header field matches. Here is one way to do it; we suppose that mail comes to various internal mail addresses. See the HEADERS macro in the previous section.
      # By [alan]
      # Examine headers, create a subject tag if we recognize a list
      :0
      * $ ${HEADERS}info@foo.com
      { TAG = "info" }
      :0E
      * $ ${HEADERS}check@foo.com
      { TAG = "check" }
      # ...and so on...
      # now, if TAG is set, insert it into the subject
      #
      MATCH       # unset MATCH before the recipe below
      :0 fhw
      * TAG ?? [^ ]
      * ^Subject: *\/[^ ].*
      | $FORMAIL -I "Subject: $TAG - ${MATCH:-<no subject>}"

Or you could use command line arguments. Add the following line to your .forward (shown here in alias file syntax):

      foo: "|/usr/local/bin/procmail -m /usr/local/etc/pm-tagit.rc foo"

Then in pm-tagit.rc you would instead say:

      ARG = $1
      :0
      * ARG ?? ^^foo^^
      { TAG = "foo@go" }
      :0
      * ARG ?? ^^somethingelse^^
      { TAG = "somethingelse@go" }

This method will work even if someone Bcc:s a message to foo@some.com.

14.28 How to call program with parameters [toc]

...now, suppose I want to call a program with parameter $FOUND, and get the result back in RESULT. How do I do it?

The stdout of the program will be captured and stored in the variable RESULT. Also consider what should happen if there are spaces or tabs in the value of $FOUND. It is probably better off enclosed in quotes.

      #   Make sure FOUND is not empty before passed to program
      :0
      * ! FOUND ?? ^^^^
      { RESULT = `program "$FOUND"` }
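
Why the quotes matter is easiest to see in plain sh. In this sketch count_args is a made-up stand-in for your program; it only reports how many arguments it was given.

```shell
#!/bin/sh
# Sketch: count_args is a hypothetical stand-in for "program"; it just
# reports how many arguments it received.
count_args() {
    echo $#
}

FOUND="two words"
UNQUOTED=$(count_args $FOUND)       # the shell splits the value on blanks
QUOTED=$(count_args "$FOUND")       # the value stays one argument
```

With the value "two words", the unquoted call sees two arguments and the quoted call sees one, which is usually what the program expects.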

15.0 Miscellaneous recipes [toc]

15.1 Sending two files in a message [toc]

If you plan to send multiple files in a message, be sure that every file has an extra blank line at the end so that they can be cat'd together. Instead of doing
      (cat THIS; echo " "; cat THAT ) | $SENDMAIL -t

You do

      (cat THIS THAT ) | $SENDMAIL -t

But sometimes you don't have control over the files; then you can do this to make sure there is a blank line. Notice that only two processes are used, compared to the first choice.

      (echo '' | cat THIS - THAT ) | $SENDMAIL -t

[David] And an sed expert would do it this way

      (sed -e '$ !b' -e '/./G' -e "r THIS" THAT ) | $SENDMAIL -t

Now remember that everywhere except the last line, we've skipped ahead, so the rest of the code will be executed only for the last line of the input.

This side of sed comes out only after sed has had a few drinks...
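
Both tricks are easy to try on scratch files before you wire them to sendmail. A minimal sketch, with made-up function names and the file names passed as arguments:

```shell
#!/bin/sh
# Sketch with hypothetical function names; the pipelines are the ones
# from the text, with the file names passed as arguments.

# Print FILE1, one empty line, then FILE2.
cat_with_blank() {
    echo '' | cat "$1" - "$2"
}

# Print FILE2, an empty line unless FILE2 already ends with one,
# then FILE1 (sed reads its main input, FILE2, first).
sed_with_blank() {
    sed -e '$!b' -e '/./G' -e "r $1" "$2"
}
```

Note that the two functions emit the files in opposite order: the sed version prints its main input first and appends the file named in the r command after the last line.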

15.2 Excessive quoting of message [toc]

[25 Nov 1997 buck@Compact.COM] I administer a LISTSERV mailing list and our host has asked us to reduce excess quoting of previously posted material. ...Subject: asking if this was excessive quoting. With the weights below (+10 per quoted line, -15 per other line), this extra copy will activate when more than 60% of all body lines are quoted.

[era] I would definitely tolerate 75% quotes. And in the end, you will of course always have to face the kinds of people who would rather change their quoting style to evade such constraints than quote less. An idealized quote parser should perhaps realize that a non-blank prefix that recurs on a lot of lines is probably a customized quote string. This will preserve the correspondent's original subject (with a Re: added if it didn't already have one) and thus the template text should indicate the nature of the problem.

I'm not sure what would be appropriate to generate behaviour more like I suggest below, any takers? Perhaps no score at all for empty lines, neutralize .signatures (hope sender obeys "-- " convention) and add 10^0.5 for each quoted line and dish out -15^0.3 for nonquoted? (I haven't really explored this -- could be completely up the creek.) [Also, perhaps long runs of quoted material should be penalized harder than quoted snippet -- reply text -- quoted snippet -- reply text alternations?]

      COPY_ADDRESS = "listAdm@foo.com"
      :0
      * ^Sender: <mailing list tag>
      {
          # - quoted lines
          # - non-blank, non-quoted lines
          # - completely blank lines
          :0B
          *  10^1 ^[      ]*>
          * -15^1 ^[      ]*[^    >]
          * -15^1 ^[      ]*$
          {
              # You don't need to repeat the original condition here
              # You also don't really need to extract SENDER
              # Generate a reply with appropriate headers and the
              # body quoted
              :0 fhw
              | $FORMAIL -rtk -A "Bcc: $COPY_ADDRESS"
              # Now "replace" the body with template text + body (In
              # other words, add the template before the quoted body)
              :0 fbw
              | cat $HOME/template.txt -
              # Now send it off to recipients mentioned in generated
              # header
              :0
              ! -t
          }
          # Wasn't excessively quoted; save to Mail/directory/
          :0
          $HOME/Mail/directory/.
      }
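
To get a feel for the weights before letting them loose on a list, you can compute the same sum by hand. The awk sketch below mirrors the three scoring conditions; the function name quote_score is made up, and the score is only my reading of how the +10 and -15 weights add up.

```shell
#!/bin/sh
# Sketch: mimic the weighted scoring above for a message body on
# stdin. Quoted lines add 10, every other line (blank or not)
# subtracts 15. Prints the final score; a positive score corresponds
# to the recipe's "excessively quoted" branch.
quote_score() {
    awk '
        /^[ \t]*>/ { score += 10; next }
        { score -= 15 }
        END { print score + 0 }
    '
}
```

Run a few real postings through it, body only, and adjust the weights until the sign of the score agrees with your own judgement.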

15.3 Sending message to pager in chunks [toc]

I have a 200 character limit on my pager. But I have wordy contacts who go over that limit. What I would like to do is have a recipe split up messages addressed to my pager into 200 character (max) messages.

[era] This stuff about forwarding to pagers is a recurring topic on this list. I've tried to find a good summary of all the issues but there always seems to be some tiny twist to what people would like to have implemented. As a general comment for future generations, the Procmail part is usually trivial and the problem reduces to writing a good program (shell script or otherwise) for formatting the text precisely the way you want it, and spitting it out in suitable chunks.

Here's something to split up the body of the message into smaller chunks and do a shell script on each chunk. The -s option to fold says to only wrap lines on whitespace if possible

      #   Create a duplicate of the message to forward to the pager.
      #   This will be reformatted and have most headers stripped off.
      :0 c
      {
          # Construct header with only From: and Subject: retained
          HEADER = `$FORMAIL -XFrom: -XSubject:`
          #   Reformat body as 200-character lines and send each
          #   as a separate message with the preconstructed minimal
          #   header. The extra echo ends the last chunk with a
          #   newline so that "read" does not drop it.
          :0bw
          |   (tr '\012' ' '; echo) | fold -s -w 200 | while read line; do \
              echo -e "$HEADER\n\n$line" | \
              $SENDMAIL $SENDMAILFLAGS pageraddress@wherever.com ; done
      }

If your version of echo doesn't understand \n to mean newline (and/or the -e option to enable this escape processing), you need to tweak this. (You might need to anyway -- this is mostly untested. In my limited testing, I found the messages would arrive in more or less random order. Inserting pauses in the script should help to some extent, but could lead to other problems and is not an ideal solution anyhow.)

I don't know if the header trimming is required; some pager gateways appear to count the headers as part of the message, while others don't. Again, for future generations, details like this are relevant to include when you ask about how to do this.
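
The chunking part can be tested on its own, without a pager or sendmail. In this sketch chunk_text, handle_chunk and send_chunks are made-up names, and handle_chunk only prints what a real script would mail.

```shell
#!/bin/sh
# Sketch: join stdin into one line and re-split it into chunks of at
# most WIDTH characters, breaking at blanks where possible.
chunk_text() {
    width=${1:-200}
    # tr removes every newline, so echo adds one back at the end;
    # otherwise "read" would drop the final chunk.
    { tr '\012' ' '; echo; } | fold -s -w "$width"
}

# Placeholder: a real script would hand "$1" to the mailer here.
handle_chunk() {
    printf '[%s]\n' "$1"
}

# Feed each chunk to handle_chunk, one call per chunk.
send_chunks() {
    chunk_text "$1" | while IFS= read -r line; do
        handle_chunk "$line"
    done
}
```

Try it with a small width first, for example "send_chunks 20 < somefile", to see where the breaks fall.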

15.4 Playing particular sound when message arrives [toc]

[Peter S Galbraith galbraith@mixing.qc.dfo.ca] Here is the command in shell to produce the sound:
      % cat anyfile | /usr/X11R6/bin/auplay /usr/lib/exmh/drip.au

However, it won't work directly in the recipe

      procmail: Executing "/usr/X11R6/bin/auplay /usr/lib/exmh/drip.au"
      Can't connect to audio server

Strange. The command works from the shell if I su to user mail. Anyway, I got it to work by fully specifying the audio server (which is my workstation, where I receive mail)

      AU      = /usr/X11R6/bin/auplay
      TUNE    = /usr/lib/exmh/drip.au
      :0 hwic
      * ^From:.*foo@bar.com
      | cat > /dev/null; $AU -audio tcp/mixing:8000 $TUNE

15.5 Combining multiple Original-Cc and Original-To headers [toc]

How can I use procmail/formail to combine the information in these headers into their corresponding headers, minus the Original- prefix? Note that I can have multiple Original-Cc: headers and I want all the recipients combined into one Cc: header.
      #   1998-01 by [david]
      #   initialize as unset
      ORIG_TO ORIG_CC
      #   The -c option to formail takes care of headers continued onto
      #   indented lines; the pipe to tr takes care of multiple
      #   Original-To: headers by linking their contents with commas.
      :0
      * ^Original-To:.*[^   ]
      { ORIG_TO = `$FORMAIL -zcxOriginal-To: | tr \\12 ,` }
      #   Drop trailing comma from tr:
      :0 A
      * ORIG_TO ?? ,^^
      * ORIG_TO ?? ^^\/.*[^,]
      { ORIG_TO = $MATCH }
      #   Likewise for Original-Cc: lines:
      :0
      * ^Original-Cc:.*[^   ]
      { ORIG_CC = `$FORMAIL -zcxOriginal-Cc: | tr \\12 ,` }
      :0 A
      * ORIG_CC ?? ,^^
      * ORIG_CC ?? ^^\/.*[^,]
      { ORIG_CC = $MATCH }
      #   Now, let's install the changes if needed:
      #   with -A instead of -I or -i it should
      #   not clobber existing To: or Cc: information.
      #   -A : Append a custom headerfield onto the header in any case.
      :0
      * ORIG_TO ?? ^^^^
      * ORIG_CC ?? ^^^^
      { }
      :0 Efwh
      | $FORMAIL                                                       \
        ${ORIG_TO:+-A "To: $ORIG_TO"}                                 \
        ${ORIG_CC:+-A "Cc: $ORIG_CC"}
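
The joining step is the part most worth testing by itself. Here is a shell sketch that does the tr trick and strips the trailing comma with sed instead of a second recipe; join_with_commas is a made-up name.

```shell
#!/bin/sh
# Sketch: turn one value per line on stdin into a single
# comma-separated value, without the trailing comma that tr leaves.
join_with_commas() {
    tr '\012' ',' | sed -e 's/,$//'
}
```

Piping two addresses into it, one per line, should give you them back on one line with a single comma between them.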

15.6 Forwarding sensitive messages in encrypted format [toc]

      #   by [alan]
      #   See if addressed *directly* to me, and ..
      #   ..has not already been forwarded
      KEY             = "TheMagic"
      FORWARD_EMAIL   = "foo@bar.com"
      :0
      * $   ^To:.*$LOGNAME(@|[^0-9a-z]|$)
      * $ ! ^X-Loop: $MY_XLOOP
      {
          # now let's encrypt the body with crypt and base64-encode
          # the result with mimencode
          :0 fb
          |   echo "MIME-Version: 1.0" ;                              \
              echo "Content-Type: application/crypt" ;                \
              echo "Content-transfer-encoding: base64" ;              \
              echo "" ;                                               \
              crypt $KEY | mimencode -b
          #   Now let's prepare the headers for forwarding the mail,
          #   and mark it so we don't loop
          :0 fh
          | $FORMAIL   -I"Resent-To: $FORWARD_EMAIL" -I"X-Loop: $MY_XLOOP"
          :0
          ! $FORWARD_EMAIL
      }

16.0 Procmail and PGP [toc]

16.1 Decrypt pgp messages automatically [toc]

Warning: if you use remailers or anonymous services, you must use different passwords and different user id's to decrypt incoming messages. If you just receive messages encrypted with one key, then this may be useful to you. However, it is generally considered a huge security risk to keep your password carved into your .procmailrc.
      :0 fb
      * B ?? PGP ENCRYPTED MESSAGE
      | pgp -z "your pass phrase" -f +batch 2>&1

16.2 Getkeys from keyserver [toc]

      # by Adam Shostack adam@bwh.harvard.edu 1996-02
      #
      # This first ruleset protects me from mailbombs from an automated
      # service that I often send incorrect commands to, generating 5mb
      # of reply. It also sorts based on success of the command.
      #
      # swissnet.ai.mit.edu is a fast keyserver
      :0
      * From bal@swissnet.ai.mit.edu
      {
         :0 h
         * >10000
         /dev/null
         :0 h
         * ^Subject:.*no keys match
         /dev/null
         :0E
         | pgp +batchmode -fka
      }

16.3 Auto grab incoming pgp keys [toc]

      #  [Opher Kahn kahn@dg-rtp.dg.com] This first ruleset protects
      #  me from mailbombs from an automated service that I often send
      #  incorrect commands to, generating 5mb of reply. It also sorts
      #  based on success of the command.
      #
      #  swissnet.ai.mit.edu is PGP key server
      :0
      * From bal@swissnet.ai.mit.edu
      {
         :0 h
         * >10000
         /dev/null
         :0 h
         *^Subject:.*no keys match
         /dev/null
         :0E
         | pgp +batchmode -fka
      }
      #  auto key retrieval
      #
      #  I have an elm alias, pgp, that points to a keyserver. The
      #  logfile gets unset briefly to keep the elm lines out of my
      #  logfile.
      :0 W
      * B ?? -----BEGIN PGP
      * ! ^FROM_DAEMON
      { KEYID = `/usr3/adam/bin/sender_unknown` }
      LOGFILE=
      #   #todo: We should get rid of the 'elm' dependency here.
      #   #todo: correct this sometime... [jari]
      #
      #
      :0 ahc
      * ! ^X-Loop: Adams autokey retrieval
      | $FORMAIL -a"X-Loop: Adams autokey retrieval" | elm -s"mget $KEYID" pgp

      #!/bin/sh
      #
      #   Script: sender_unknown
      #
      #   Prints a keyid when the sender's key is unknown; exits 1 if
      #   the key is known. $OUTPUT is there to get at the exit
      #   status; otherwise this would be a one-liner.
      #
      OUTPUT=`pgp -f +VERBOSE=0 +batchmode  -o /dev/null`
      echo $OUTPUT | egrep -s 'not found in file'
      EV=$?
      if [ $EV -eq 0 ]; then
              echo $OUTPUT | awk '{print $6}'
      fi
      exit $EV
      #
      # end of sender_unknown

17.0 Includerc usage [toc]

17.1 Using multiple includerc files [toc]

Do INCLUDERC statements function as a kind of "call" which returns control to the "original" rc file if processing falls off the end of the included rc file? Or if processing falls off the end, does mail then get delivered to $DEFAULT and processing stop? Suppose I have these commands
      INCLUDERC = $PMSRC/pm-a.rc
      INCLUDERC = $PMSRC/pm-b.rc
      INCLUDERC = $PMSRC/pm-c.rc

Yes, control is returned to the original file from which the includerc was called. And no, mail does not get delivered to $DEFAULT just because the includerc ends: processing continues until there are no more statements in the top level.

An includerc is nothing more than a slice of the top level rc file.
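
If you know shell scripting, the closest analogy is the "." (source) command. The sketch below is only that, an analogy in sh and not procmail code; run_includes and the TRACE variable are made up, and each included file is assumed to append its own name to TRACE.

```shell
#!/bin/sh
# Analogy only: "." in sh behaves like INCLUDERC in that control comes
# back to the caller when the included file ends.
run_includes() {
    dir=$1
    TRACE=""
    for rc in a b c; do
        . "$dir/pm-$rc.rc"          # each file appends its name to TRACE
    done
    TRACE="$TRACE top"              # still running in the "top level"
    echo "$TRACE"
}
```

After the three files have been read, the caller carries on with its own next statement, just as procmail carries on with the line after the last INCLUDERC.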

17.2 You can do includerc conditionally [toc]

One interesting way to prevent false hits when filtering UBE is to first check whether the message comes from some known, valid source. If it does, then it shouldn't be run through the UBE filter, because the filter may throw valid messages out. No UBE filter is completely bullet proof.

Here is an example where the UBE detection is put into use only when the message comes from somewhere that I don't know beforehand (or I have just forgotten to tweak my .procmailrc)

      :0                      # Idea by Bill Moseley
      * ! ^TO_me@here.is
      * ! (procmail|list-a|list-b)
      {
          # Could be UBE or I might be on a unknown distribution list.
          INCLUDERC = $PMSRC/pm-ubecheck.rc
      }

[dan] That would work; common practice, however, is to put recipes for filing mail from lists (and, per Bill's preferences, anything mentioning procmail in the head gets treated the same as mail from this list) first; then the only remaining condition to consider there would be unexpected blind carbons: * ! ^TO_moseley. This method is good if you get much more spam than legitimate mail (counting mail from list subscriptions as legitimate) and you want procmail to deal with spam right away. I belong to several very active mailing lists, so I actually receive more pieces of legitimate mail than pieces of spam.

One way to get the best of both worlds is this:

      * $ ! ()\/(^TO_$LOGNAME|procmail|list-(ABC|123|XYZ))

because then, if the regexp matches (and thus the negated condition fails and you don't detour into $PMSRC/pm-ubecheck.rc), MATCH is already set to the name of the mailing list, and you can do further tests by just examining MATCH (or a variable you copy it into) instead of repeating a complete head search. [I prefer to use the variable $LOGNAME rather than hard-coding my name because then others can use the code, I can use it unchanged on sites where my logname is different, and if my logname is changed my procmailrc will keep up with it.] For example (I've separated the conditions into two lines so that, per Bill's preferences, a mention of procmail in the head will get the message into the Procmail List folder, even if a match to ^TO_$LOGNAME is also present and appears sooner):

      MATCH # make sure it's unset going in
      :0
      * ! ()\/(procmail|list-(ABC|123|XYZ))
      * $ ! ^TO_$LOGNAME
      {
          INCLUDERC=$PMSRC/pm-ubecheck.rc
      }
      #   The next recipe has an E flag, so it will be examined
      #   only if the preceding one didn't match; thus if $MATCH was
      #   set inside pm-ubecheck.rc, it won't hurt anything here, and a
      #   value for $MATCH set in pm-ubecheck.rc
      #   won't be mistaken for a list name:
      :0E: # MATCH is non-null only if it matched a list name
      * MATCH ?? .
      $MATCH
      #   Remaining recipes will be read only for two types of mail:
      #   those that met $^TO_$LOGNAME but not any expected list
      #   name, and those that went through pm-ubecheck.rc but came out
      #   undelivered.

17.3 The basics of constructing a general purpose includerc [toc]

Here are my rules of thumb that I used when I started making a few general includerc scripts with procmail
          # @(#) pm-xxx.rc -- procmail script for ...
          # DOCS
          DEFINE ALL USER VARIABLES
          CODE
          # end of pm-xxx.rc

          # BAD
          THE_FLAG = ${THE_FLAG:-"yes"}
          # Good, when the includerc name is pm-xxrap.rc
          XX_SCRIPT_THE_FLAG = ${XX_SCRIPT_THE_FLAG:-"yes"}

          #   User option
          XX_SCRIPT_THE_FLAG = ${XX_SCRIPT_THE_FLAG:-"yes"}
          #   Private variable used later in the file.
          charset = "[a-z0-9]"        # alternatively xx_script_charset

          dummy = "start of pm-xxx.rc"
          ...
          dummy = "Now testing if we have control message XXX"
          :0
          * condition
          {
              dummy = "Now testing if the command is YYY"
              :0
              * condition
          }
          ...
          dummy="end of pm-xxx.rc"

      Do not call formail unconditionally like this:
          XX_SCRIPT_SUBJECT = `$FORMAIL -zxSubject:`
      Because the value may already be available prior to your
      includerc. For example, the user may already have needed the
      *Subject* value and stored it in a variable
          SUBJECT = `$FORMAIL -zxSubject:`
          ...
          INCLUDERC = $PMSRC/pm-xxScript.rc
      and your code launches an unnecessary formail call. Instead,
      use the existing SUBJECT. The *XX_SCRIPT_SUBJECT+!* construct
      is explained elsewhere in this document if you don't understand
      it. (#REF var init; #variable_initialisation_and_shell_call)
          SUBJECT = `$FORMAIL -zxSubject:`
          ...
          XX_SCRIPT_SUBJECT   = $SUBJECT            # Note this!
          INCLUDERC           = $PMSRC/pm-xxScript.rc
          [ in the pm-xxScript.rc variable definitions  ]
                  #   User should initialise the variable
                  #   XX_SCRIPT_SUBJECT if he already has read the
                  #   subject.
                  #
                  :0h
                  *$ ${XX_SCRIPT_SUBJECT+!}
                  { XX_SCRIPT_SUBJECT = `$FORMAIL -zxSubject:` }
                  ...the rest of the code
          [ end of pm-xxScript.rc excerpt ]

          :0
          * condition
          * ! ^FROM_DAEMON
          * ! ^X-Loop: id-string-of-your-choice
          {
              # Ok, now we're clear to send automated reply
          }
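
The ${VAR+!} and ${VAR:-default} forms used in this section are borrowed from sh, so you can watch them work in a shell. The sketch below shows only the shell side; the names WHEN_UNSET, WHEN_SET and KEEP are made up. As I read the recipe, an expansion to "!" negates the (empty) condition so the formail call is skipped, while an empty expansion lets it run.

```shell
#!/bin/sh
# Sketch of the two expansions used above, as sh evaluates them.
unset XX_SCRIPT_SUBJECT
WHEN_UNSET="${XX_SCRIPT_SUBJECT+!}"        # empty: variable is unset

XX_SCRIPT_SUBJECT="Re: hello"
WHEN_SET="${XX_SCRIPT_SUBJECT+!}"          # "!": variable is set

unset XX_SCRIPT_THE_FLAG
XX_SCRIPT_THE_FLAG=${XX_SCRIPT_THE_FLAG:-"yes"}    # default applied

KEEP="no"
KEEP=${KEEP:-"yes"}                        # existing value kept
```

The same check works from an interactive prompt: echo "[${SOMEVAR+!}]" prints [!] when SOMEVAR is set and [] when it is not.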

17.4 An includerc skeleton [toc]

Here is the includerc file skeleton that I use in all my modules. The funny looking ".$" are for the text2html Perl filter. The documentation section can be ripped and turned into HTML very easily if you just keep the standard 4 tab column positions, start the description with "File id" and end it with "Change Log". The command to make the HTML is:
      % ripdoc.pls pm-yourfile.rc | t2html.pls >  pm-yourfile.html

These two Perl files are available from my ftp directory.

      # @(#) pm-yourfile.rc -- <one line description string here>
      # @(#) $Id: pm-tips.txt,v 1.31 1998/03/10 08:29:52 jaalto Exp $
      #
      #   File id
      #
      #       .Copyright (C)  1997-98 Foo Bar
      #       .$Contactid:    foo@bar.com $
      #       .$Created:      YYYY-MM $
      #       .$keywords:     procmail [subroutine|recipe] whatItDoes $
      #
      #       This code is free software in terms of GNU Gen. pub. Lic. v2 or later
      #       You can get newest version by sending email to maintainer with
      #       subject "send <FILENAME>"
      #
      #   Description
      #
      #       This subroutine Parses <what> from variable INPUT
      #
      #   Required settings
      #
      #       PMSRC must point to source directory of procmail code.
      #       This subroutine will include
      #
      #       o   pm-xxx.rc
      #       o   pm-yyy.rc
      #
      #   Call arguments (variables to set before calling)
      #
      #       o   INPUT, the string from where to parse...
      #       o   VAR1, description, default is ...
      #       o   VAR2, description, default is ...
      #
      #   Returned values
      #
      #       ERROR will have value "yes" if couldn't parse INPUT
      #       OUTPUT will have result after successful parse
      #
      #   Example usage
      #
      #           :0
      #           * condition\/.*
      #           {
      #               INPUT = $MATCH
      #               INCLUDERC = $PMSRC/pm-yourfile.rc
      #               #  OUTPUT has the result
      #           }
      #
      #   Change Log: (none)
      # ..................................................... &init ...
      dummy       = "init: pm-yourfile.rc start"

      #  Read the standard variable definitions if they are not
      #  yet defined
      :0
      * !  WSPC ?? [ ]
      { INCLUDERC = $PMSRC/pm-javar.rc }
      # .................................................... &input ...
      # - User configurable variables with reasonable defaults
      # - But parameters like "INPUT" that must be set beforehand
      #   are not mentioned here.
      VAR1    = ${VAR1:-"default1"}
      VAR2    = ${VAR2:-"default2"}
      # .................................................... &do-it ...
      dummy       = "subroutine: pm-yourfile.rc parses now that and that"
      <the code>
      dummy       = "subroutine: pm-yourfile.rc end."
      # end of pm-yourfile.rc

18.0 Mailing list server [toc]

18.1 Mailing list server pointers [toc]

"Meng on procmail" http://res2.resnet.upenn.edu/procmail

18.2 Simple Mailing list server [toc]

      # by Lars Hecking lhecking@nmrc.ucc.ie
      #
      MAJORDOM="majordomo-(users|docs|workers)"
      :0 w
      * $ ^(Sender|To|Cc):.*\/$MAJORDOM
      * $  MAJORDOM ?? ()\/$\MATCH
      | $APPNMAIL $LISTS/$MATCH

Here is another, by Brock Rozen brozen@netvoyage.net with ideas from [dan]

      # get the date in RFC822 format for insertion into some messages;
      # the "Resent-Date:" field is copied from the "Date:" field on
      # some systems. RFC1123 says "All mail software SHOULD use 4-digit
      # years in dates..."
      #
      LIST_NAME = "myList"
      LIST_ADDR = "$LIST_NAME foo@bar.com"
      LIST_DATE = `date '+%a, %d %h %Y %H:%M:%S %Z'`
      LIST_ERR  = "$EMAIL"        # my admin address
      #   Sendmail ignores "To:" in the presence of "Resent-To:"
      #
      :0 fhw
      *$ !^X-List: $LIST_NAME
      *$ ^TO()$LIST_NAME
      | $FORMAIL                                                      \
              -A "X-List: $LIST_NAME"                                 \
              -I "Resent-To: $LIST_ADDR"                              \
              -i "Resent-Date: $LIST_DATE"                            \
              -I "Errors-To: $LIST_ERR"                               \
              -A "Precedence: bulk"                                   \
              -A "X-Loop: $COMSAT"
      :0 a
      ! -oi `cat /var/tmp/src/power-users.list`

19.0 Common troubles [toc]

I'm a new sys admin at my company, and I've been trying to set up Procmail as the mail filtering device (still using mail as the Mlocal). I've tried setting up sendmail.cf to use Procmail as a filter (we want to keep the current mailer as the local mailer) with one local procmail rc file. Procmail seems to work just fine if set up as the local mailer, but I'm still having problems setting it up as the filter.

[John M Vinopal banshee@abattoir.com answers sendmail.cf]

      R$+ < @ $=a . > $*
          $#procmail $@ /etc/mail/procmailrc $: $1 < @ procmail > $3
      R$+ <@ procmail > $*                            $1 < @ resort.com .> $2

So this sends anything of the form foo@resort.com through procmail and rewrites it as foo@procmail. The procmail script reinjects it; the reinjected message bypasses the call to procmail and is then rewritten back to foo@resort.com.

      /etc/mail/procmailrc:
      :0
      ! -oi -f "$@"

19.1 My ISP isn't very interested in installing procmail [toc]

I recently asked my ISP to install procmail, and they responded by saying no. Their main reason was that they did not wish to incur the traffic from any/all of their subscribers setting up mailing lists.

[Jon Lewis jlewis@inorganic5.chem.ufl.edu] Wouldn't you need write access to either /etc/aliases or /etc/procmailrc to setup mailing lists? Tell the ISP that procmail will greatly improve mail delivery and enable all users to filter out junkmail without ever seeing it. If they still refuse, find a better ISP.

19.2 My ISP has systemwide procmailrc; is this a good idea? [toc]

[eli] I, for one, do not like my ISPs to put stuff in /etc/procmailrc. There is precious little I will gain from that and plenty of opportunity for them to make mistakes I would not have. At one ISP I know people got upset at some sendmail level filtering of email. One of those upset is a habitual complain-to-spammer-ISP person. He did not want problems seeming to go away if they were really there. Another guy just didn't trust the filtering.

Writing a shell script that will give the user a .procmailrc which includercs a system wide shared procmailrc is the best way to do it. This forces the filtering to be "opt-in".

19.3 Procmail changes mailbox and directory permissions [toc]

By Ed McGuire emcguire@i2.com. Before procmail was used:
      > -rw-rw----   1 foo      mail  1127 Sep 11 07:33 foo

After:

      > -rw-------   1 foo      mail  1517 Sep 11 07:34 foo

When the UMASK environment variable is more restrictive than the mode of the mailbox, procmail changes the mode of the mailbox. The default value of UMASK is 077. If you want to preserve the group access to your mailbox, I think you can set UMASK to 007 in the rcfile:

      UMASK = 007

Further note: the above UMASK suggestion in .procmailrc does not work. See this comment by Gjermund Sørseth gjermund@nextel.no:

However the permissions on DEFAULT are handled before procmail even opens the .procmailrc, so changing the umask there will have no effect on the mailspool.

[Scott J. Kramer sjk@lux.com] it's documented in the MISCELLANEOUS of the procmail(1) man page:

If /var/mail/$LOGNAME already is a valid mailbox, but has got too loose permissions on it, procmail will correct this. To prevent procmail from doing this make sure the u+x bit is set.

Otherwise, you might notice a syslog message like:

procmail: Enforcing stricter permissions on "/var/mail/sjk"

when it chmod's the file to 600. As you've discovered, this is inconsistent with the SYSV (Solaris 2 anyway) default mailbox protection of 660, gid=6 (mail). I think that's an OS-dependent bug, with the `chmod u+x ...' as the workaround.
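
The chmod u+x workaround quoted above can be tried on a scratch file before touching a real spool. This is only a minimal sketch for a POSIX shell: the helper name mbox_mode and the scratch file are made up for illustration, and whether a given procmail build really honours the u+x bit depends on how it was compiled.

```shell
# mbox_mode is a hypothetical helper: print the ten permission
# characters of a file, as they appear in the "ls -l" listings above.
mbox_mode() {
    ls -ld "$1" | cut -c1-10
}

# Work on a scratch file, not on a real mail spool.
MBOX=$(mktemp)
chmod 660 "$MBOX"       # group-writable, like the SYSV default
chmod u+x "$MBOX"       # the workaround: set the owner's execute bit
mbox_mode "$MBOX"
```

On a real system you would run the chmod on /var/mail/$LOGNAME (or wherever your spool lives) and then watch the log for the "Enforcing stricter permissions" message.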

19.4 Changing mbox permission during compilation to 660 [toc]

It appears that procmail writes the mail it delivers back into the spool with owner.group user.mail and mode 600. To me this is reasonable. Mail delivered to the spool by /bin/mail is written out with owner user, group mail and mode 660.

When procmail has delivered mail with mode 600, later delivery attempts made after procmail has been removed from the .forward file fail: /bin/mail doesn't have permissions (or refuses to use its permissions).

Since we have fickle and unruly users who will be moving their forwards in and out of place, this is a problem.

Is the correct solution to force procmail to write 660? If so, how is this done? I assume in the section of config.h just below the warning about only messing with a section if you think you know what you are doing. I don't feel like I know well enough what I'm doing to walk into that territory without some guidance.

[alan] I used to be the manager of the system support in the College of Engineering, at the University of California, Santa Barbara.

We supported about 1500 users from two HP 9000 G30's, using one of them as the centralized mailer. Mail was available via NFS exported /usr/spool/mail to over 200 workstations, of all kinds: SGI, HP, Sun, etc.

We replaced /bin/mail with procmail as the local mailer (Mlocal) because procmail correctly avoided NFS-locking problems, and it supported user-configurable mail filtering, without compromising system security.

In over two years subsequent to the change, we had no loss of mail due to procmail being used as the local mailer. If you wish further comment from the current system managers, send email to "postmaster@eci.ucsb.edu".

To answer your specific questions:

* you can configure the permissions directly, by changing one of the following defines in config.h:

      /* bit set on mailboxes when mail arrived */
      #define UPDATE_MASK     S_IXOTH
      /* if found set, the permissions on the mailbox will be left untouched */
      #define OVERRIDE_MASK   (S_IXUSR|S_ISUID|S_ISGID|S_ISVTX)
      #define INIT_UMASK      (S_IRWXG|S_IRWXO)       /* == 077 */
      #define GROUPW_UMASK    (INIT_UMASK&~S_IRWXG)   /* == 007 */

We did not find it necessary, however.

* Write a special rule in sendmail.cf which delivers mail using Mprocmail instead of Mlocal when the destination user is in the special procmail user class.

This lets users who want procmail-direct delivery have it in spite of management worrying.

I set this up to test procmail delivery on our system before changing Mlocal to use procmail. We placed some "volunteer" users in the procmail class file, and they never had any problems (I was one of them).

19.5 The .forward file must be real file [toc]

I tried to make a softlink to ~/.forward, but then my procmail wouldn't run. When I made a real ~/.forward file, then it worked again. My question is -- why would procmail treat a link to a file any differently than the actual file itself?
      ln -s ~/.procmail/forward ~/.forward

[Werner Reisberger wr@tribe.ping.de] That's not a problem with procmail, this is an MTA issue. For security reasons sendmail will not deliver mail to files which are symlinks.

[david] procmail has restrictions on what permissions it will tolerate on an rcfile. For example (I'm just guessing here) it can tell whether it can read the target file but it cannot tell who might be able to write to it. This prevents a major security hole.

You can make a hard link to the file, since a hard link is completely indistinguishable from the original file. But note: a file hard-linked to two or more names is very distinguishable from a file with only one (hard) link, and procmail, for example, will not deliver to a plain folder that has two or more hard links.

You can also put the real file at ~/.forward and let ~/.procmail/forward be a symlink to it.

[mikk0022@maroon.tc.umn.edu] I suppose the reasoning behind procmail's folder policy is that procmail locks the file by name, not inode. Hence it cannot guarantee mutual exclusion for access to a file which has multiple names.

My understanding of the .forward policy is that a symlink need not share the permissions of its target. Therefore somebody's .forward symlink could have proper permissions, while its target could be writable by others. This would allow anybody with the write permissions to execute any program (potentially) from the user's forward file.

Two hard links share the same permission, so this argument doesn't hold.
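
To see which of the cases discussed above a given file falls into, something like the following can help. It is a rough sketch for a POSIX shell; forward_kind is a made-up helper name, and the files live in a scratch directory rather than in a real home directory.

```shell
# forward_kind is a hypothetical helper that classifies a path as
# "symlink", "multilink" (more than one hard link) or "plain".
forward_kind() {
    if [ -L "$1" ]; then
        echo symlink
    elif [ "$(ls -ld "$1" | awk '{ print $2 }')" -gt 1 ]; then
        echo multilink
    else
        echo plain
    fi
}

# Demonstration in a scratch directory.
DEMO=$(mktemp -d)
: > "$DEMO/real"
forward_kind "$DEMO/real"           # only one name so far
ln -s "$DEMO/real" "$DEMO/soft"
forward_kind "$DEMO/soft"           # a symbolic link
ln "$DEMO/real" "$DEMO/hard"
forward_kind "$DEMO/hard"           # second name for the same inode
```

Note that after the hard link is made, both names report more than one link; there is no way to tell which of them was the "original".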

19.6 Qmail: Procmail looks file from /var/spool/mail only [toc]

Procmail seems to want to do something in /var/spool/mail. But since I use qmail, I don't have a /var/spool/mail. Is there a way to have procmail not create temp stuff there?

[phil] Get procmail 3.11pre7, then uncomment and correct for your local setup the MAILSPOOLHOME="/.mail" define in src/authenticate.c. Compile and install. It's relative to the user's home directory, thus the name MAILSPOOLHOME.

[Ekkehard Knopp knopp@rz-online.de] At the qmail home page you can find a patch for procmail-3.11pre7 called procmail-maildir-patch. If you can't find it, I can send it to you by mail. I have no problems with procmail and qmail; it works well.

19.7 Qmail: patch to procmail 3.11pre7 to work with Maildirs [toc]

[Jaye Mathisen mrcpu@cdsnet.net] On the www.qmail.org page is a patch that lets procmail 3.11pre7 work with Maildirs (qmail's NFS-safe delivery format), and not just Mailboxes.

Very useful. Really slows down delivery though. On my test box, just adding procmail to the delivery, where all it did was deliver to the default mailbox with no other rules, whacked my speed test from something like 600,000 messages/day to about 180,000.

Killer. I suspect Procmail's locking of the Maildir 8 ways from Sunday is probably partially to blame.

19.8 Auto deleting old messages [toc]

Once the messages were in the correct folder, I would suggest using cron and mush. mush is able to manipulate messages on a wide variety of criteria, but works with them once they are already in the folder. Here is a command which would delete all messages that are one week old or more:
      pick -ago -1w | delete
IMHO, with procmail to do the pre-processing, and mush to do the post-processing, I have an unbeatable mail combination. --Brian Dockter brian@nds.com

I have em.shar on anonymous ftp://ucssun1.sdsu.edu/pub/unix --Ron Nash nash@sdsu.edu

19.9 Help, some idiot sent my address to 30 mailing lists [toc]

You can write a procmail recipe to junk the incoming mail from the lists until your unsubscribe messages have been delivered and your participation is cancelled. You should complain to each list's maintainer that such a thing was even possible: the mailing list should have sent you a confirmation message with a unique "participate ID number" that you need to send back in order for the subscription to take effect.
      KILL_FILE = $PMSRC/.kill-immediately
      :0
      *$ ? test -f $KILL_FILE
      { KILL = `cat $KILL_FILE` }
      # Make sure KILL has value
      :0
      * KILL ?? [a-z]
      *$ $KILL
      /dev/null
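
The kill file has to hold a single regexp. One way to build it, sketched here under the assumption that you keep the offending list addresses one per line in a separate file, is to escape the dots and join the lines with "|". The helper name make_kill_regexp and the example addresses are made up; an empty input file would give a useless "()" pattern, so check for that before using the result.

```shell
# make_kill_regexp is a hypothetical helper: read addresses (one per
# line) from file $1, escape the dots and join everything into one
# alternation that can be stored in the kill file.
make_kill_regexp() {
    sed -e 's/[.]/\\./g' "$1" | paste -s -d '|' - |
        sed -e 's/^/(/' -e 's/$/)/'
}

# Example input; the addresses are placeholders.
LISTS=$(mktemp)
printf '%s\n' 'list-a@example.com' 'list-b@example.org' > "$LISTS"
make_kill_regexp "$LISTS"
```

The output would then be redirected into the file that KILL_FILE points to.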

[sean] ...In the long haul, your best bet with dealing with this problem is to stamp out the offender - bring this harassment to the attention of their ISP and get their account closed. Repeat as necessary. Most of the mailing lists should have some record of the submission request. Even if forged, the abuser probably has their IP address in the headers somewhere (and if the person is actively subscribing your friend to so many lists and actually WORKING at covering their tracks, apparently you've REALLY crossed them). Most people who stoop to these immature harassment tactics aren't bright enough to fully cover their tracks.

Another alternative to having to manually deal with unsubs on certain lists is once you've identified filterable characteristics of the lists, BOUNCE them. Most semi-intelligent listserv implementations will unsub you if they get repeated bounces. Yea, not nice to the listserv maintainer - but then, if perhaps they'd implement a subscription verification system, it wouldn't have been a problem to begin with.

      :0
      * some_listmatching_condition
      {
          # may expose your .forward - but if you're bouncing lists,
          # it probably doesn't matter much.
          EXITCODE = 67
          # save header for examination.
          :0 h:
          bounce.log
      }

You've got a sticky situation. You can't simply ditch all unrecognized mail - you need to be able to review potential refuse first, and take action on anything which doesn't belong (because you certainly don't want to continue getting the non-wanted lists till the end of eternity - you should want to unsubscribe from them to simplify your mail).

19.10 Help, Procmail beeps and prints to my console [toc]

...when messages get filtered through procmail I get a beep and then first 10 lines or so are also sent to the console. I get a lot of messages so the beeps, and stuff on my screen is getting very annoying.

[sean] One or the other should do the trick (or both even): Go to your login file (what it is named depends on the shell you're using), and add:

      biff -n

Or/also, in your .procmailrc add:

      COMSAT = "no"

[manual] has information on the COMSAT variable. It also states (contrary to the reasoning I gave above) that COMSAT defaults to 'no' if you specify an rc file on the command line (otherwise, it is on by default).

Doing this latter one should keep procmail from generating COMSAT/BIFF notifications, but would still leave your shell capable of receiving them, say, if you only processed certain mail in procmail manually or somesuch. Personally, I turn biff off AND set the COMSAT off. I read my mail when I read my mail, and I check it often enough (with a POP client at that).

19.11 Help, procmail dumps mail to console [toc]

...I have installed sendmail and procmail on my Linux machine (latest version of Slackware). It works OK, but procmail, if run with -d $u, dumps all mail to the console immediately after receiving it, with ---- more ----. I don't like it; a beep is OK, but I do not want all the garbage on my screen. Is there a way to tell procmail that I just want the mail in my mailbox (/var/spool/mail/$u)? Thanks for the help!

[Xavier Beaudouin kiwi@oav.net] Check your /etc/inetd.conf for an in.comsat entry, add a '#' at the beginning of the line, save the file and run killall -HUP inetd. This should stop this ;-)
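
The edit can also be scripted. The sketch below is only an illustration for a POSIX shell: disable_comsat is a made-up helper that works as a filter, and the two sample lines are invented, so compare them with what your own /etc/inetd.conf really contains before relying on the pattern.

```shell
# disable_comsat is a hypothetical filter: it reads inetd.conf-style
# lines on stdin and comments out every active line mentioning comsat.
# Lines that already start with '#' are left alone.
disable_comsat() {
    sed -e '/^[^#]*comsat/s/^/#/'
}

# Try it on sample lines first (invented for this example).
printf '%s\n' \
    'ftp     stream  tcp  nowait  root  /usr/sbin/tcpd  in.ftpd' \
    'comsat  dgram   udp  wait    root  /usr/sbin/tcpd  in.comsat' |
    disable_comsat
```

To apply it for real, write the filtered output to a temporary file, inspect it, move it over /etc/inetd.conf as root and then send inetd the HUP signal as described above.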

19.12 Help, corrupted From_ line in mailbox [toc]

Jeffrey S. Gilton jeffg@castlec.com 1998-02-11 in procmail mailing list "Solved the FFrom problem"

Thanks to everyone who responded to my questions about a problem where the From line was getting corrupted. Here I tell what was the real problem.

To recap, when our Caldera OpenLinux 1.1 system received multiple email messages very quickly, some messages would get multiple F's on the from line and then subsequent messages would be missing the F's.

Most responses said that it sounded like a file locking problem. Suggested solutions were to get the latest version of procmail or recompile our version so that it would look at the file locking mechanisms.

The funny thing was that three systems with new installs didn't exhibit the problem.

The file locking recommendation eventually led to the real problem. On a good system I would run our spam script (we spammed ourselves to trigger the problem) and everything would work. Using top I would see multiple instances of procmail running. Looking at the directory where the spool files were, I would see a spool_file.lock file get created and then go away.

Finally, I did the exact same thing on the system that wouldn't work. There I would see the multiple instances of procmail running but no lock file being created. I said to myself "Now that I know what is happening, the question is why."

It turned out to be a permission problem on the spool directory. On the system that worked, the permissions were rwxrwxr-x with the owner being root and the group being mail. On the system that didn't work, the permissions were rwxr-xr-x with the owner and group being root. This meant that procmail, which runs as mail, couldn't write to the directory. We changed the broken system to rwxrwxr-x with owner root and group mail. The problem disappeared.

As I said, the suggestions about lock files were key. It guided our investigation until we found the real problem. I thank everyone who responded.

I've seen other postings about corruption of the From line. Perhaps you have the same problem.

[Christopher B. Smith cbsmith@envise.com] I had the exact same problem with my upgraded OpenLinux system. For the record, if you are running the imapd that comes with it, you should really set your permissions for the directory as follows:

      chmod 1777 /var/mail/spool

I got that feedback from the guy who wrote imapd, and it works very well.

19.13 Directing user's mail to HOME instead of /var/spool/ [toc]

I have a need to direct all of a single user's mail to a mailbox in his home directory, to $HOME/mailbox.
      # One possible solution, not perfect
      #
      UHOME       = /tmp_mnt/users
      UHOME_LIST  = "(login1|login2|login3)"
      :0 :
      *$ ^TO\/$UHOME_LIST@
      *   MATCH ?? ()\/[^@]+
      $UHOME/$MATCH
[era] Preferably use ^TO_ if you have Procmail 3.11pre7 or newer. This is the classical case of using Procmail where you really need the envelope recipient information. The headers are +not+ enough to determine who a message is for. If Procmail is your MDA, you can have this, but I'd still think something involving Sendmail would be more appropriate. For one thing, what if this user suddenly really wanted to use Procmail? You can set DEFAULT and ORGMAIL for this one user in /etc/procmailrc to get around that, but the bottom line, as so many times before, is that Procmail might not be the right tool for this.

19.14 I can't see the SENDMAIL's response in LOGFILE [toc]

As the man page says, this should've written to my LOGFILE. It didn't. But it DID activate the pipe in the recipe. So what's up here?
      :0 hc
      * ? test -f $HOME/.vacation
      | LOG=| ($FORMAIL -r; echo $IM_NOT_HERE) | $SENDMAIL -t

[david] The man page says that a variable capture recipe assigns the standard output of the command to the variable. Since you are repiping the output of formail and echo to sendmail, sendmail sucks up the standard output of formail and echo. Sendmail itself does not write to standard output, so the stdout of ( $FORMAIL -r ; echo $IM_NOT_HERE ) | $SENDMAIL -t is nothing.

Thus you're assigning a null string to $LOG, and when procmail writes $LOG to the logfile you can't see a difference.

19.15 Compiling procmail and choosing locking scheme [toc]

General advice: everything except dot locking is usually broken.

[stephen, 199607292139.XAA12433@hera.cuci.nl] Remove fcntl() and lockf(), only allow flock() (or omit it completely). Kernel locks don't work, but that's all some programs use. Across a networked filesystem, lockf() doesn't work; fcntl() and flock() should, but they don't either because the lockd is buggy. Mailtool uses fcntl() but does it wrong, so that's another problem. The only thing that works on all platforms, all networks, all the time is .lock files.

Makefile refers to:

      #LOCKINGTEST=100
      #        Uncomment (and change) if you think you know
      #        it better than the autoconf lockingtests.
      #        This will cause the lockingtests to be hotwired.
      #        100     to enable fcntl()
      #        010     to enable lockf()
      #        001     to enable flock()
      #        Or them together to get the desired combination.

config.h refers to:

      /*#define NO_fcntl_LOCK uncomment any of these three if you */
      /*#define NO_lockf_LOCK definitely do not want procmail to make */
      /*#define NO_flock_LOCK use of those kernel-locking methods */
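
For readers who have not met dot locking before, the idea can be shown in a few lines of shell. This is a deliberately simplified sketch, not what procmail or lockfile(1) actually do internally (they take more care, in particular over NFS); the helper names try_dotlock and release_dotlock and the scratch mailbox are made up for illustration.

```shell
# try_dotlock: attempt to create "<mailbox>.lock". With noclobber
# (set -C) the redirection fails if the lock file already exists,
# so only one caller at a time gets a zero exit status.
try_dotlock() {
    ( set -C; : > "$1.lock" ) 2>/dev/null
}

# release_dotlock: remove the lock file again.
release_dotlock() {
    rm -f "$1.lock"
}

# Demonstration on a scratch mailbox.
DEMO=$(mktemp -d)
MBOX="$DEMO/mbox"
if try_dotlock "$MBOX"; then
    echo "locked, delivering"
    echo "message body" >> "$MBOX"
    release_dotlock "$MBOX"
else
    echo "mailbox busy, try again later"
fi
```

The important property is that the lock is a file name in the same directory as the mailbox, which is why every cooperating program needs write permission on that directory.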

20.0 Implementation details [toc]

20.1 What happens to mail if MDA Procmail fails [toc]

When procmail is the local mailing agent distributing e-mail to a user's $HOME and the target machine is 'down', where does the e-mail go? I was given the impression that the mail would be collected on the 'mailhub' in /usr/mail/BOGUS.xxx (Solaris system). It is not happening and we have the potential of losing mail.

[phil] I assume that by "target machine" you mean the NFS server for the given user's account. Procmail's attempt to read ~/.procmailrc will timeout, then when it tries to write to $DEFAULT (which you say is in their home directory) it'll time out (again) and return EX_CANTCREAT to sendmail. Sendmail will then presumably bounce the message.

Now, if sendmail is looking for .forward files in user home directories, then procmail will never be called, as sendmail will try to open the .forward file and consider it a transient error when it times out, causing the message to be queued for a later delivery attempt.

(Note: invoking procmail with the -t flag causes it to return EX_TEMPFAIL instead of EX_CANTCREAT. This would cause the message to be requeued. However, this is not generally recommended.)
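
The two exit codes mentioned above come from sysexits.h, where EX_CANTCREAT is 73 and EX_TEMPFAIL is 75. The sketch below only illustrates the distinction as described in this section; mta_action is a made-up helper, and what a real sendmail does with other exit codes depends on its configuration.

```shell
# Values from <sysexits.h>.
EX_CANTCREAT=73
EX_TEMPFAIL=75

# mta_action is a hypothetical helper: given the exit status of the
# delivery agent, print what the MTA is described as doing above.
mta_action() {
    case "$1" in
        0)              echo delivered ;;
        "$EX_TEMPFAIL") echo requeue ;;
        *)              echo bounce ;;
    esac
}

mta_action "$EX_CANTCREAT"    # what procmail returns by default
mta_action "$EX_TEMPFAIL"     # what it returns when run with -t
```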

20.2 Procmail reads entire 90Mb message into memory [toc]

...last week my workstation ground to a halt when procmail received a 90Mb Email message (ran out of memory). The point is, such message sizes are fine by me, as long as the system can handle it. Is there any way I could make procmail only read the headers of that message before scanning /etc/procmailrc or ~/.procmailrc and acting on it? That way it wouldn't need to read the entire message into memory.

...Recently, I modified the sendmail.cf file to pipe messages through procmail before sending them to deliver, so that I can use system-wide procmail recipes for spam filtering. However, yesterday we had a client send a 22 megabyte e-mail message (on purpose, no less) and the system just came to its knees trying to deliver it to the user's mailbox.

[phil] Btw, All the versions of /bin/mail (or mail.local) that I've seen the source for either read the entire message into memory first or use a temp file. Depending on where temp files are located, a 90MB temp file may be just as bad as holding it in memory.

And, No, there isn't. Hacking it in would be non-trivial, mainly because the current code runs with the assumption that the entire message is there, and determining when it actually needs to see the entire body (to do demand loading) would not be easy. Remember that a condition on the size of the message, ala

      :0
      * > 10000000
      /dev/null

would require the body to be read... It really is just better to simply have sendmail enforce the limit. You should be doing it there anyway to cut down on the totally trivial denial-of-service attacks and because it's more efficient.

...I am running procmail ver 3.11pre7 and I keep getting "out memory as i tried to allocate 8xxxxxx bytes.". I have over 100 meg of swap space available, so I have a difficult time understanding this. Is this a known error?

Procmail's memory allocation technique appears to be non-optimal for some OS/libc combos, namely for some implementations of the libc function realloc() (FreeBSD has been reported). It's conceivable that the configuration process could be enhanced to detect this system limitation and use a more efficient strategy on such systems. Don't hold your breath.

[ed] There is a patch available that should fix the problem for you. See the messages at <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse/procmail?query=Albsmeier&errors=0&case=on&maxfiles=100&maxlines=30>.

20.3 Variables DEFAULT and ORGMAIL [toc]

According to the man pages, DEFAULT is defined as ORGMAIL...so if I redefine ORGMAIL, then DEFAULT should change as well, which doesn't help me. Any help on this would be appreciated

[david] DEFAULT is initially defined as equal to $ORGMAIL. Once procmail has started reading /etc/procmailrc (if it is the MDA) or your .procmailrc, you can change the value of either without affecting the other.

In fact, you can even set DEFAULT on the command line when you invoke procmail (I'm not sure about doing that with ORGMAIL, though), and that value will override its normal initial value equal to $ORGMAIL.

What if dropping to DEFAULT can fail because the disk is full? Then you had better have another drop place on another file system. See bdf(1) or df(1) to find out the different mounted file systems.

      # Place this to the end of your .procmailrc and define
      # DEFAULT_SECONDARY
      :0 :
      $DEFAULT
      :0 e :
      $DEFAULT_SECONDARY

If you deliver explicitly to $DEFAULT, procmail treats it like any other save-to-folder recipe, and if the write fails, it continues reading recipes.

...If I had set the "deliver" destination as $ORGMAIL rather than $DEFAULT, would it have made any difference?

Nope. If you write a recipe for it, procmail just expands the variable and doesn't give a heck if it happens to be the same destination as DEFAULT or ORGMAIL. DEFAULT is special to procmail only when it uses it on its own after falling off the end of the rcfile; ORGMAIL is special only at startup (without -m) and when procmail falls off the end of the rcfile and finds that it cannot save the message to DEFAULT.

20.4 When DEFAULT cannot be mailed to [toc]

If procmail gets to the end of the rcfile without delivery (or without being directed to another rcfile by an INCLUDERC or HOST assignment), it assumes these:
      :0:
      $DEFAULT
      :0 e:
      $ORGMAIL

That is, it tries to deliver to $DEFAULT and if it can't, it tries $ORGMAIL. If that fails too ("deep, deep trouble" as Stephen says in the man page), it exits without delivery and reports failure to the MTA, which, depending on other factors, will either requeue the letter and try delivering later or will bounce it to the sender.
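
The same "try the first place, then the second, then give up" logic can be mimicked in plain shell, which may make the behaviour easier to picture. This is only a sketch: deliver_with_fallback is a made-up helper, it does no locking at all (procmail does), and the paths are scratch files.

```shell
# deliver_with_fallback PRIMARY SECONDARY < message
# Append the message to PRIMARY; if that fails, append it to
# SECONDARY; if that fails too, return EX_CANTCREAT (73).
deliver_with_fallback() {
    tmp=$(mktemp) || return 1           # could not even buffer the message
    cat > "$tmp"                        # buffer the message from stdin
    if { cat "$tmp" >> "$1"; } 2>/dev/null ||
       { cat "$tmp" >> "$2"; } 2>/dev/null
    then rc=0
    else rc=73
    fi
    rm -f "$tmp"
    return "$rc"
}

# Demonstration: the primary mailbox sits in a directory that does
# not exist, so the message lands in the fallback file.
DEMO=$(mktemp -d)
echo "test message" |
    deliver_with_fallback "$DEMO/no-such-dir/inbox" "$DEMO/fallback"
cat "$DEMO/fallback"
```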

20.5 Variable DROPPRIVS [toc]

I have procmail invoked from a mailtable for a virtual domain. Presently that runs as root, inherited from sendmail. I'd like to have it run less privileged. I tried chown'ing the rc file to the user I want used and setting "DROPPRIVS=yes". That didn't do it. So I added "LOGNAME=user" and "USER=$LOGNAME" before the DROPPRIVS assignment and that didn't work.

[phil] DROPPRIVS only has an effect inside the /etc/procmailrc used when procmail is running in delivery mode (-d), not when it's running in mailfilter mode (-m). USER and LOGNAME have no effect on the working of DROPPRIVS, as procmail is just going to change to the uid/gid of the user specified on the command line after the -d. Your mailtable entry should be specifying the procmail mailer, which runs procmail in mailfilter mode.

If the following are true:

then procmail will assume the uid and gid of the owner of the rcfile. If the rcfile is actually a symlink, then procmail will assume the uid and gid of the link itself, not the underlying file. If your OS allows anyone to give away ownership of files with chown, then procmail adds the following restriction to those above:

      /etc/procmailrcs must be owned by root and mode 700.

20.6 Variable HOME [toc]

[david] Since procmail doesn't understand tilde, you have to use variable HOME instead.
      CONTENT   = `cat ~/file.txt`        # Won't work
      CONTENT   = `cat $HOME/file.txt`    # ok

But accessing another user's home is another story. You could change the SHELL temporarily to get procmail to understand the reference, like this:

      SHELL   = /bin/csh
      CONTENT = `cat ~user/file.txt`
      SHELL   = /bin/sh                   # restore original setting

Because the tilde is in $SHELLMETAS, procmail will invoke a shell when it sees a tilde. It's better to skip the extra process of a shell and use the $HOME variable: put a symlink somewhere under your own home directory that points to the other user's file so that you can use the $HOME variable in your .procmailrc and avoid the shell invocation.

However, there are dangers in this too, because the sysadmin may move home directories and your symlinks may go out of date. If you expect such changes and broken links, then you could cache the needed home directories at the time you need them:

      HOME_PHIL   = `ksh -c "echo ~phil"`
      HOME_ED     = `ksh -c "echo ~ed"`

20.7 Variable HOST [toc]

[phil] If an assignment to the "HOST" variable occurs where the assigned value doesn't equal the hostname of the machine on which procmail is running, procmail will stop reading the procmailrc, and if there are other procmailrcs specified on the command line, it will start reading them.

[david] .. about explanation of the HOST variable,

It goes back to the early days of procmail, before Stephen thought of INCLUDERC or the "var ?? condition" syntax. When people had to use different code based on which local host machine was processing a particular message, the method was to list a number of rcfiles on procmail's command line. The first one would start out with general code for all messages and all hosts and then have a

      HOST = some.specific.machine

assignment, followed by code for mail delivered on that machine. If the first nine characters of "some.specific.machine" matched the real value of $HOST, procmail would stay in that rcfile; on a mismatch, it would jump to the second rcfile named on the command line.

The second rcfile would probably be for another particular machine, so (unless it first had some universal code for all machines except the first one, or unless there were only two machines where procmail might run) right at the top it would have

      HOST = this.specific.machine

Again, a match for the first nine characters would keep procmail reading this rcfile, but a mismatch would make it jump to the next rcfile.

And so it went. An incorrect HOST assignment (note that "HOST" alone attempts to unset the variable, so it is always an incorrect assignment) in the last rcfile on the command line made procmail drop the message and exit. Since we almost never name more than one rcfile on the command line now, attempting to unset HOST in .procmailrc will have that effect.

I would guess that the only use of this original setup still around is in SmartList, where flist invokes procmail with a number of rcfiles on the command line and uses things like HOST=go.to.the.next.rcfile.now to move from one to the next. Also, procmail's -m facility (which didn't exist back then) is incompatible with using HOST to jump among rcfiles, because it requires naming exactly one rcfile on the command line.

Nowadays we can do something like this to use different rcfiles on different hosts:

      :0
      * HOST ?? ^^\/[^.]+
      { INCLUDERC = $HOME/.$MATCH.rc }
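
The condition above captures everything in HOST up to the first dot and uses it to build the rcfile name. To check which file name a given host would map to, the same computation can be done in shell. This is a small sketch; host_rcfile is a made-up helper and the host name is a placeholder.

```shell
# host_rcfile is a hypothetical helper: strip everything from the
# first dot onwards and build the per-host rcfile name from the rest.
host_rcfile() {
    printf '%s/.%s.rc\n' "$HOME" "${1%%.*}"
}

host_rcfile "mailhub.example.com"
```

Creating the printed file on each host you deliver on is then all that is needed for the recipe to pick it up.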

20.8 Variable LINEBUF [toc]

[manual] Length of the internal line buffers, cannot be set smaller than 128. All lines read from the rcfile should not exceed $LINEBUF characters before and after expansion. If not specified, it defaults to 2048. This limit, of course, does not apply to the mail itself...

[phil] Those 160 lines of condition are almost certainly overflowing LINEBUF. You should either a) use one of the innumerable recipes sent to the list demonstrating the use of fgrep; b) break it into multiple recipes; or c) increase LINEBUF. If you modify this list of domains regularly, then you should strongly consider (a), as (b) and (c) just put off it happening again.

LINEBUF only applies to lines from procmailrcs. You generally only have to worry about LINEBUF when you have a variable expansion or command expansion (backquotes) that doesn't have an obvious and reasonable bound on its size. procmail will avoid overrunning its LINEBUF length buffer when doing command expansions by ignoring the extra output, so you're safe there, as long as data truncation is fine. Variable expansion isn't checked like that, so you can cause procmail to coredump by doing something like:

      :0
      * ^Subject: \/.*
      |some-program $MATCH

then feeding procmail a message with a huge Subject: header field: since no shell meta characters appear in the action, the action line will be expanded and exec()ed by procmail directly instead of by the shell. On the other hand, the following is fine:

      :0
      * ^Subject: \/.*
      |some-program $MATCH;

The semicolon forces a shell invocation, and the shell should be safe. If your /bin/sh can buffer overrun on variable expansion, then you're in more trouble than you know.

Action lines aren't the only place to watch your variable expansions. Variable assignments and condition lines that have a leading dollar sign also undergo expansion. For example, this isn't safe:

      SUBJECT = `$FORMAIL -x Subject:`
      NEWSUBJ = "Subject: $SUBJECT"

procmail won't buffer overrun in the first line, but a really long subject could cause the second to do so. The following should be safe:

      NEWSUBJ = "Subject: `$FORMAIL -x Subject:`"

but even then only if you're sure the shell is doing the expansion of NEWSUBJ.

Note that matching against the value of a variable (using the "var ??" condition special) is safe no matter what the size of the contents of the variable. The problem is when you interpolate the variable into something else.

Is there any easy way to find out the default LINEBUF value for a specific procmail? I'm sure there's a much easier way, but this will work:

      #   Mitsuru Furukawa
      #
      OUT     = $HOME/tmp/linebuf.lst
      :0 wc:  $OUT.lock
      *$ ! ? test -f $OUT
      | echo "$LINEBUF" > $OUT

[phil] If you examine the procmailrc manpage, you'll note that it lists fourteen variables (among them DEFAULT but not LINEBUF) whose values are reset in the environment by procmail, plus some additional ones like IFS, ENV, PWD, and PATH which come out of the top of config.h. Following this is a list of all of procmail's magic variables, including those fourteen. The idea is that while procmail has thirty magic variables, only fourteen of them are put into the environment by procmail.

The others may have default values, but they're 'input only': if what you're doing depends on one of the others having a certain value, then you should just go ahead and set it to that value. I know of only two ways to find out what value procmail is using by default: a) check the manpage (the manpages should show the correct default for the machine), or b) fire up your favourite debugger and hope that no one stripped the procmail binary.

There will be no error message when procmail dumps core, even though the cause is apparently that LINEBUF has been exceeded.

20.9 Variable TRAP [toc]

Here is one example of how to write to the logfile. Be sure that you have preset all the variables; this just demonstrates the usage of TRAP. Pay attention to the right use of single and double quotes if you pass the values to the shell, as in this example, where the /dev/ is removed from the FOLDER variable's value.
      TRAP = 'echo "
      FROM    $FROM
      TO/CC   $TO / $CC
      SUBJECT $SUBJECT
      FOLDER  $LASTFOLDER
      " | sed -e "s#FOLDER   /dev/#FOLDER   #g"'

And if your MUA expects the file to be touched before it sees new incoming mail, here is a recipe by [david]:

      TRAP='touch -m $HOME/Mail/$LASTFOLDER' # with strong quotes

Place it early in your rcfile; then each recipe that saves to a directory can look simply like this, and the trap will take care of the touching:

      :0 flags # no local lockfile needed for save to directory
      * conditions
      directoryname/.

20.10 Variable UMASK [toc]

There is a better way to find out which folders contain new mail if you are using procmail to filter your mail. (This was a hack by one of my friends.) Procmail allows you to set UMASK for the folders. So before doing anything, set UMASK to 076, which means the permissions on any folder that receives mail will be -rwx-----x. Now find -perm -001 prints the folders that have new mail. The shell script that does this will also have to chmod o-x all these folders.

How does this work? AFAIK umask only applies to new files being created and not to appending to existing files, which is what procmail essentially does, right?

[era] Procmail does interpret UMASK this way, so this works, but I don't think it's a particularly good solution. It's actually hinted at in the documentation for UMASK in procmailrc(5). find is a rather heavy program to start up every time you want to look for mail. (Haven't done any timings, though.)
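The find and chmod steps described above could be combined roughly as follows. This is only a sketch: the function name list_new_mail is invented here, and it assumes the mail folders are plain files below one directory and that their names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print the mail folders under directory $1 that
# carry the o+x "new mail" mark, then clear the mark on each of them.
list_new_mail() {
    find "$1" -type f -perm -001 -print | while IFS= read -r folder; do
        printf '%s\n' "$folder"
        chmod o-x "$folder"
    done
}
```

Calling it as list_new_mail "$HOME/Mail" a second time prints nothing until procmail delivers again, because the first run removed the marks.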

20.11 Sinking to /dev/null and h flag [toc]

When you drop something to /dev/null, use the h flag so that procmail does not unnecessarily try to feed the whole message there.
      :0 h
      * condition
      /dev/null

[phil] Procmail knows that it shouldn't create a locallock on /dev/null and that it shouldn't kernel lock /dev/null, and it knows to write it "raw" (no "From " escaping or appended newline). This means that procmail simply opens /dev/null, does its write with one system call, and closes it.

I'm not sure if adding the 'h' flag makes a real difference on modern UNIX kernels. I suppose it depends on how optimized the write() data is and in particular, whether a user-space to kernel-space copy is required, or whether it's delayed. If it's delayed then the code for handling /dev/null would presumably not do it, and the size of the write wouldn't actually matter.

20.12 Performance difference between backtick and "|" recipe [toc]

Procmail sends the whole message to stdin whenever it sees backticks used. If you use a recipe instead, you can add the h flag to feed only the header to the program, and not the whole message. Let's ask an academic question: which one of the choices below is more efficient?
      # Side effect: Do something with shell
      dummy = `echo hi there > some-file.txt`
      :0 hwic
      | echo "hi there" > some-file.txt

Procmail sends the whole message to the backtick command on the first line and only the headers to the recipe. Answer: it doesn't matter. Either way procmail will make one write system call, which will return 0 [bytes written], and off it goes. You should use the first one, because the latter affects the A and E flags later; the first one is also clearer overall.

Someone suggested the following, but it was rejected because it hurts performance more [stephen]. The cat process is useless and directing to /dev/null does not buy anything.

      :0 hwic
      | cat - /dev/null; echo "hi there" > some-file.txt

20.13 Procmail's temporary file names while writing file out [toc]

      /disk3/home/stanr/Mail 119) ls -la backup
      total 22
      drwx------  2 stanr         512 Nov 11 21:00 .
      drwx------  3 stanr        2560 Nov 11 21:11 ..
      -rw-------  1 stanr        3063 Nov  4 03:31 .nfsA0c724.4
      -rw-------  1 stanr        1780 Nov  3 23:00 .nfsA47da4.4
      -rw-------  1 stanr         849 Nov  3 23:22 .nfsA481f4.4
      -rw-------  1 stanr        2293 Nov 11 11:28 .nfsA737d4.4
      -rw-------  1 stanr        2598 Nov 11 20:39 msg.HCJB
      -rw-------  1 stanr        3127 Nov 11 21:00 msg.ICJB
      -rw-------  1 stanr        1884 Nov 11 20:45 msg.KCJB
      /disk3/home/stanr/Mail 120)

Any ideas what might make those .nfs* files? They contain
messages which seem to have been successfully processed by
procmail in the later parts of the .procmailrc . However,
I doubt they'd ever get cleaned up if I didn't discover them.

[david] Procmail uses a temporary name while it is trying to write a file out, which it renames if things go well. I noticed that they all came from a 4h 31min span overnight; perhaps there was some systems work being done on your machine that screwed things up?

      :0 ic
      | cd backup && rm -f dummy `ls -t msg.* .nfs* | sed -e 1,3d`

[aaron] When a file that is being used by a program on an NFS client gets unlinked the NFS server renames it to something like that. It should then actually get unlinked when the file is closed, but it looks like the NFS server never got the close message for those.

[Keith Pyle keith@ibmoto.com] It is a result of using NFS, but the fault lies with the operating system on the NFS client. Keep in mind that NFS is stateless from the perspective of the NFS server. It keeps no information on how any file is being used. So, if a client tells the server to delete the file, the server deletes the file. This is not normally a problem, but many programs use a "trick" of Unix where the program opens a file, unlinks (deletes) it, and then continues to use the file. For all local files, the Unix kernel will not actually delete the file until all processes which have the file open exit. This works very well for temporary files.

If a client tells an NFS server to delete a file, it will delete the file immediately because of the stateless nature of NFS. The server has no way of knowing if any client still has the file open. To avoid this problem, if a client unlinks an open file on an NFS filesystem, the file is renamed to .nfs* where * is a unique value. The NFS client system is supposed to delete the .nfs* file when the process exits. However, there are some versions of Unix which do not do this well (e.g., AIX). If one of these OS's is used, it is common to find .nfs* files in various places. Therefore, it is a good idea for system administrators to periodically purge any .nfs* files over a certain age to eliminate the unsightly buildup in the filesystems.
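The periodic purge suggested above might start from a listing like this one. It is a cautious sketch that only prints candidates; the function name stale_nfs_files and the seven-day default are assumptions for the example, not recommendations from the posters.

```shell
#!/bin/sh
# Hypothetical helper: print .nfs* files under directory $1 that were
# last modified more than $2 days ago (7 days when $2 is not given).
stale_nfs_files() {
    find "$1" -type f -name '.nfs*' -mtime +"${2:-7}" -print
}
```

Once the output looks right, the list can be handed to rm. Files that are still young are left alone, since a running process may yet be using them.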

20.14 Parameter $@ [toc]

[david] As of version 3.11pre7, procmail does not grok "$*", nor does it grok "$@" outside a pipe or forward action. The obvious attempt to get the positional parameters all quoted together, as with "$*", does not work:

      ARGS = `echo "$@"`

Procmail substitutes null for "$@" there. This works, though:

      :0 ir
      ARGS=|echo "$@"

After that you use "$ARGS" instead of "$*".

If you try to set ARGS with ARGS="$@", procmail doesn't substitute for "$@" and makes $ARGS null. If you try ARGS="$*" you get the literal text '$*'.

[phil] Of course, $ARGS differs greatly from $@ in that $ARGS will either be split on whitespace (if unquoted) or one argument (if double-quoted). $@ has the cool property that if double quoted it'll still be split into multiple arguments on the original argument boundaries. Since full-blown email addresses often have spaces, this distinction should not be casually dismissed. Note that while you might not type in such addresses, your MUA's reply builder may.
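The difference is easy to see in an ordinary shell, outside procmail. The sketch below uses two made-up function names (count_args, show_split) and flattens the arguments into ARGS the way the capture recipe would.

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical wrapper: pass its own arguments on in three different ways.
show_split() {
    ARGS="$*"              # flatten into one string, like $ARGS above
    count_args "$@"        # original argument boundaries are preserved
    count_args "$ARGS"     # double-quoted: always exactly one argument
    count_args $ARGS       # unquoted: re-split on whitespace
}

show_split "Alice Liddell <alice@the.croquet.game>" hare@small.hole
```

With the two addresses shown, the three calls receive two, one and four arguments respectively, so only "$@" hands the full address with its embedded spaces through intact.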

20.15 Procmail variables are null terminated (detecting null string) [toc]

You can't catch a null in the message with a variable, e.g. if you try this:
      NUL=`/usr/5bin/echo "\000"`
      :0HB
      * $ $\NUL
      { LOG="Caught NUL" }

[phil] It won't work as expected. The problem is that environment variables (and therefore procmail variables) are null-terminated, and therefore cannot contain a null. The above line creates an empty variable. The solution is to use an inverted character class:

      NUL = `/usr/5bin/echo '[^\001-\377]'`

Note that procmail handles 8-bit characters except for null in procmailrcs, so you can use a literal control-A and octal-377 in your .procmailrc and save an echo and shell invocation right there.
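If all you need is a yes/no answer about NUL bytes, an external test avoids putting anything NUL-related into a variable at all. The following is a sketch under that assumption; has_nul is a hypothetical name, and the message is expected in a file.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the file named by $1 contains at
# least one NUL byte, by comparing its size with and without NULs.
has_nul() {
    total=$(( $(wc -c < "$1") ))
    stripped=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$total" -ne "$stripped" ]
}
```

The byte counts are compared as numbers, so the padding some wc implementations add to their output does not matter.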

20.16 FROM_DAEMON TO and TO_ and case-sensitiveness [toc]

[david] ^TO is case-insensitive by default. Stephen once told me something to the effect that tokens like ^TO, ^TO_, ^FROM_DAEMON, and ^FROM_MAILER are always case-insensitive, even if the recipe has the D flag, but I'm not positive that that was what he was saying, and we never pursued it. Certainly they are insensitive to case if there is no D.

[phil] If a regexp contains the ^FROM_DAEMON token, then that entire regexp is treated as case-insensitive. Other conditions in the recipe are not affected by this. The other tokens have no effect on case sensitivity. (This is with procmail 3.11pre4)

20.17 TO_ macro deciphered [toc]

If the regular expression contains ^TO_ it will be substituted by `(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)', which should catch all destination specifications containing a specific address.

[elijah] Let's rewrite that in perl /x format. See below.

What is the essential difference between TO and TO_ ?

The definition of the word boundary in block (E). See below. The ^TO_ expansion was added in v3.11pre4. You'll probably have to use just ^TO (no '_'), which should work almost as well. The difference is that ^TOalias1@site may match something like the following, while ^TO_alias1@site won't:

      bobs-alias1@site

The ^TO_ expansion in perl /x format:

      /                       # [begin regexp]
       (                      # [Block (A)]
        ^                     # Anchor to start of line
         (                    # [Block (B)]
           (Original-)?       # Optionally precede (C) with "Original-"
            (Resent-)?        # Optionally precede (C) with "Resent-"
                    (         # [Block (C)]
                     To       # "To"
                    |Cc       # or "Cc"
                    |Bcc      # or "Bcc" {very rare in practice}
                    )         # [end (C)]
           | (                # [Block (D)]
              X-Envelope      # Precede "-To" with "X-Envelope"
             |Apparently      # or "Apparently"
                 (-Resent)?   #    with optional "-Resent" appended
             )                # [end (D)]
                -To           # "-To", following block (D)
          )                   # [end (B)]
             :                # ":"
             (                # [Block (E)]
              .*              # any text
                # any single char other than letters, numbers,
                [^-a-zA-Z0-9_.]
                              # hyphen (-), underscore (_), or period (.)
             )                # [end (E)]
              ?               # Block (E) is optional
       )                      # [end (A)]
      /x                      # [end regexp]
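The bobs-alias1@site case can be tried out with grep -E, which understands both expansions as written. In this sketch the ^TO_ pattern is the one quoted in this section; the older ^TO pattern, ending in the looser (.*[^a-zA-Z])? boundary, is quoted from memory of the procmailrc man page and should be treated as an assumption. The helper name macro_matches is made up.

```shell
#!/bin/sh
# The ^TO_ expansion from this section, and (assumption) the older ^TO
# expansion, which only requires a non-letter before the address.
TO_='^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?'
TO='^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?'

# Hypothetical helper: succeed when header line $2 matches macro $1
# followed by the address pattern $3 (case-insensitive, like procmail).
macro_matches() {
    printf '%s\n' "$2" | grep -E -i -q -e "$1$3"
}

macro_matches "$TO"  'To: bobs-alias1@site' 'alias1@site' && echo "TO matches"
macro_matches "$TO_" 'To: bobs-alias1@site' 'alias1@site' || echo "TO_ does not"
```

The hyphen in front of alias1 counts as a boundary for ^TO but is excluded by block (E) of ^TO_, which is exactly the difference described above.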

20.18 TO_ macro and RFC 822 [toc]

According to RFC822 the From address can contain almost anything, and the valid email address can be extracted from the line as long as it is enclosed between <...>. Like <foo@site.com>.

[by Vikas Agnihotri vikas@insight.att.com] Block (E){see TO_ macro explanation} is there to slurp up that part. The <encapsulation> is not needed, and a case such as:

      From: "jester@fun.house" fool@aol.com

Will confuse a test for "^TO_jester@". Yes, I have seen people do that stuff, apparently not even maliciously. And the following is also valid:

      From: someone@somewhere.com another@one.com

[Elijah continues] it will also confuse the regexp. I don't like the ^TO and ^TO_ macros for most things and typically use stuff like this:

      ^(Resent-)?(To|CC):.*[< ]{address}([ >]|$)

It still can be confused, but the things that will cause problems are fairly rare in practice. You might prefer something like this:

      ^(Resent-)?(To|CC):([^(]+([(].*[)])?)*[, <]{address}([, >]|$)

Which can correctly deal with

      To: (hatter@tea.party) {address}
      To: (fake {address}) bill.the.lizard@the.jury.box
      To: Alice alice@the.croquet.game, "W. Rabbit (late)"
              hare@small.hole, Gentle Reader <{address}>
      To: jabberwocky@vorpal.swords.r.us, duchess@the.croquet.game,
              chesire@no.where, {address}, dinah@meow.org

It will still fail for

      To: (fake <{address}>) mockturtle@tortoise.edu

If someone is malicious enough to send you such mail.

20.19 FROM_DAEMON deciphered [toc]

By [era]
      (([^).!:a-z0-9]   End of e-mail address token
        [-_a-z0-9]*     Another alpha token
        )?              ... or maybe not;
       [%@>\t ]         Address separator -- either address@... or
                          <address> or a bare address with whitespace
                          around it
       [^<)]*           Skip as long as we don't run into another
                          bracketed address or end of comment
                          (presumably to prevent this from matching
                          inside parenthesized comments in the first
                          place)
       (\(.*\).*)?      Skip optional parenthesized comments and
                          anything after them if found
      )?                ... or maybe not; maybe we just see an ...
      $                  ... end of line instead
      ([^>]|$)           Uh, I should know what this is supposed to do,
                          but I can't quite remember what it's for. I
                          think it had something to do with continued
                          header lines ... Anyone?

21.0 Technical matters [toc]

21.1 List of exit codes [toc]

The right place to look is /usr/include/sysexits.h, but the codes should be pretty much standard. These are from HP-UX 10, and the code that you will be using mostly is 67. It tells the sender of UBE to "piss off and delete me from your list; I'm not here".
      EX_OK          0        successful termination
      EX__BASE       64       base value for error messages
      EX_USAGE       64       command line usage error
      EX_DATAERR     65       data format error
      EX_NOINPUT     66       cannot open input
      EX_NOUSER      67       addressee unknown
      EX_NOHOST      68       host name unknown
      EX_UNAVAILABLE 69       service unavailable
      EX_SOFTWARE    70       internal software error
      EX_OSERR       71       system error (e.g., can't fork)
      EX_OSFILE      72       critical OS file missing
      EX_CANTCREAT   73       can't create (user) output file
      EX_IOERR       74       input/output error
      EX_TEMPFAIL    75       temp failure; user is invited to retry
      EX_PROTOCOL    76       remote error in protocol
      EX_NOPERM      77       permission denied

I thought that by using the EXITCODE, I would be assured that the email would be rejected but in fact Sendmail 8.8.7 attempts to deliver the "user unknown" to netcom.com, which is obviously wrong?

[sean] Sendmail accepts the message, then passes it on to procmail, either as the local delivery agent or via a .forward file (depending on your system's configuration). Procmail says "gee, gotta lie about not being here" and rejects the message, which is sent back into the spool and delivered according to who it appeared to come from.

Had SENDMAIL determined the user didn't exist (password file / aliases / virtusertable.txt), then it would have rejected the message right when the remote was doing SMTP RCPT. But the user WAS valid, and so it accepted it.

Another scenario is when you have a mail secondary, and your primary (where the user account and procmail are) is down. Some system goes to deliver mail to you, and resolves to your secondary -- which simply holds mail for your primary -- it hasn't a clue which user is valid and which isn't. Well, the (E)HELO (the system sending your primary the message) takes place during the SMTP session, the message is coming from your secondary - not from the original sender. At THAT point, if the user didn't exist, I believe sendmail would be issuing an unknown user error to the secondary, which in turn should mail that message back to who it thinks is the sender (I can't check my Bat book from where I'm at - any sendmail pros are welcome to elaborate).

Is there any way at all to get around this (force the rejection at delivery time)? Better yet, is there some sort of check to make sure that the Received domain reasonably matches the From: domain?

You'd need to have a ruleset in your SMTP Daemon (generally Sendmail) to check domains (which WILL fail on many valid messages, BTW) and reject it WHILE the SMTP delivery session (actually, the negotiation) is in progress. By the time Procmail has the message, you've completely accepted the message, and any rejection you might hope to do is bouncing the mail - to the apparent sender.

Such is the problem with forged mail.

I wouldn't suggest this tactic for fighting spam anyway - so much of it is forged, and any bounce you send out simply uses up system resources on your machine and those on the system that was spoofed. Spammers don't REMOVE addresses from their lists (they want the lists to look as big as possible when they go to sell them to someone else) -- some have even taken to GENERATING addresses at domains and sending messages to them with the assumption that somebody will probably have an account by that name ("bill@ joe@ dave@ ...").

Use procmail to trashbin (or otherwise file) all the junk and then manually take action on those which get through.

21.2 List of precedence codes [toc]

The priorities most sendmails recognize are the following. The lower the priority, the later the message gets dealt with. A smart vacation program will ignore anything with a list, bulk, or junk priority. --Adam Shostack adam@bwh.harvard.edu
      0    first-class
      -30  list
      -60  bulk
      -100 junk
      100  special-delivery

[dan] You should use bulk when you distribute files via File Server. The value in the Precedence: header says absolutely NOTHING about the contents of the message itself, it merely suggests a priority level to the mail system. From pp. 668 of the ORA "sendmail" book, bulk typically has a value of -200 while "junk" -100; thus a message with "Precedence: junk" will get higher priority than that of "Precedence: bulk" (although this can be changed in the sendmail.cf file).

Other than on heavily loaded machines, this value won't matter anyway, since all mail will be quickly processed.

[Stephen] ...Mail sent by a person is usually considered to be more important than autoreplies generated by some daemon. One way to express the lower priority of autoreplies is by adding a "Precedence: junk" field. This allows mail transport agents to make educated decisions about which mail to forward first (in case the mailqueue gets clogged).

Another point is other autoreply services, like "vacation". They try to make an effort not to accidentally reply to a message generated by another daemon (e.g. yours). One way they detect this is by looking at the Precedence field. If it contains "junk", they know this is not something they should respond to.
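The check that a well-behaved autoresponder makes before answering can be sketched as a small shell test on the header. The function name is_low_precedence is invented here; the three values are the ones named above as worth ignoring.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the message header read on stdin
# carries a Precedence: field of junk, bulk, or list, i.e. when an
# autoresponder should stay silent.
is_low_precedence() {
    grep -E -i -q '^Precedence:[[:space:]]*(junk|bulk|list)'
}
```

An autoreply script would run this on the header first and send its answer only when the test fails.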

21.3 Sendmail and -t [toc]

Doesn't sendmail's -t switch read To, Cc, Bcc, etc. to find the recipients of the auto response?
      :0h
      * condition
      * !^X-Loop: foo@site\.com
      | ($FORMAIL -rA "X-Loop: foo@site.com" ) | sendmail -oi -t

[david] That's not a problem, because formail -r will not generate any Cc: or Bcc: headers unless you tell formail to add them. The only line where sendmail -t will look for recipients will be the To: line.

21.4 RFC822 Reply-To and formail problem with multiple recipients [toc]

[david] formail -r extracts only one return address, even when the Resent-Reply-To: or Reply-To: header contains more than one (and Stephen has told me he plans to leave it that way).

[dan] I understand these concerns; however, RFC822 specifically allows for multiple recipients in a Reply-To: header. Given that, it seems that there should be a straightforward way to deal with this in formail; even worse is that "formail" silently ignores multiple Reply-To: addresses.

For (a), wouldn't the Reply-To: (or Resent-Reply-To:) header supersede all other addresses and thus greatly simplify the searching? For (b), how about only using multiple (Resent-)Reply-To: addresses if formail's "-t" option is also specified? Or if you are really worried about mail-storms and existing recipes, a new formail option.

21.5 Procmail and IMAP server [toc]

[ed] See also ftp://ftp.cac.washington.edu/mail/imap.vs.pop ...This paper is an elaboration on a short note entitled "Comparing Two Approaches to Remote Mailbox Access: IMAP vs. POP", which was written in 1993 and recently updated. The purpose of this paper is to provide more extensive background on message access paradigms and protocols, and then to specifically compare the Internet's Post Office Protocol (POP) and the Internet Message Access Protocol (IMAP) in the context of "online" operation.

I log in to a set of NFS-ed servers (or more precisely AFS-ed), and my mail comes into another server (not a part of this set) which is running IMAP. So sendmail never delivers mail into /var/mail/$LOGNAME on my login machines, and instead delivers to the IMAP server. Since sendmail never reads my .forward file in the home directory, I figure procmail never gets invoked.

You need a program which will fetch your e-mail from the IMAP server and then feed it to procmail. One such program that can do this is fetchmail. Check out http://locke.ccil.org/~esr/fetchmail/. The bad news is that once you do this, you probably won't be able to use an IMAP client to read your e-mail anymore. But that might be good news if you prefer an MUA that reads mbox files but doesn't grok IMAP.

21.6 Machine which processes mail [toc]

The just installed procmail does not work and I am assuming that sendmail is trying to run procmail on another machine. Is there any way I could find out the appropriate ARCHITECTURE for that machine?

[era] The following should tell you the name of the machine which processes mail for the machine you're asking about. You can then try to log in to that machine if you have shell access there, which is something you need to have in order to compile Procmail on it.

      nslookup -q=mx machine      # alternatively use host(1) command

If you don't have nslookup (doh) or don't understand what it says, try adding this to your .forward

      "|uname -a >/full/path/to/home/.uname.out"

i.e. this should be there in +addition+ to what else you do. Otherwise this will lose your mail thoroughly, since it reads the mail but doesn't save it anywhere. You might want to save a copy of all incoming mail to a safety mailbox, too, just in case. Like so:

      /full/path/to/home/safetymailbox
      |"uname -a >/full/path/to/home/.uname.out"
      |"IFS=' '&& exec /usr/local/bin/procmail -Yf- || exit 75"

If you try this, it is +very+ +important+ that the file safetymailbox exists and is writable. (man 5 forward if you have that -- I don't seem to have this manual page on systems with newish versions of sendmail, is that correct?)

Try the uname command (and/or read the manual) to see what you should expect to find in the file .uname.out

21.7 Compiling procmail and MAILSPOOLHOME [toc]

I am compiling 3.11pre7 on a new system and have a couple of questions. I edited the makefile to be the home directory "/home/a/abc" for example. I defined MAILSPOOLHOME as "/mail". The incoming mail is actually stored in "/usr/mail/abc". When I pipe test messages through procmail (using "procmail</usr/mail/abc"), rather than them ending up in my inbox, they end up in a mailbox called "msg.gs.KB". What on earth did I goof up? As I sit here and think about this, should MAILSPOOLHASH be set to 1 instead of 0?

[philip] If incoming mail is supposed to be stored in /usr/mail/loginnamehere, then you should not define MAILSPOOLHOME at all, but rather define MAILSPOOLDIR to "/usr/mail/" and leave MAILSPOOLHASH as 0. Defining MAILSPOOLHOME causes mail to be delivered inside each user's home directory, which does not appear to be what you want. MAILSPOOLHASH causes additional levels of hierarchy to be created in the spool directory, thus avoiding the 'fat slow directory' problem.


22.0 Different version features and bugs [toc]

22.1 Procmail doesn't allow filing to multiple mboxes [toc]

It is an unfortunate inconsistency that procmail will allow filing to multiple directories, but NOT to multiple file-folders. Sad but true.
      :0          # This works fine ....
      mhfolder1/. mhfolder2/. mhfolder3/.

22.2 TIMEOUT has its peculiarities [toc]

[jari] Indeed, something weird is happening. With this recipe procmail terminated the sleep but then hung forever, never executing any other lines.
      TIMEOUT = 2
      status  = `sleep 60 && echo ok`
      :0
      *! status ?? ok
      {
          dummy = "Error, do something"
      }

And this terminates sleep, but also sinks to /dev/null. Uhm. And what's that ghost over there: Executing "" ?

      :0 w
      * ? `sleep 60`
      /dev/null
      #   LASTFOLDER got set if delivery took place above
      :0
      * ! LASTFOLDER ?? /dev/null
      {
          timeout_happened
      }

The resulting log:

      procmail: Assigning "TIMEOUT=2"
      procmail: Executing "sleep,60"
      procmail: [28830] Fri Jan 23 15:51:44 1998
      procmail: Timeout, terminating "sleep"
      procmail: Executing ""
      procmail: Assigning "LASTFOLDER="
      procmail: Match on ""
      procmail: Assigning "LASTFOLDER=/dev/null"
      procmail: Opening "/dev/null"

22.3 Variable capture |= is unreliable [toc]

[david] In 3.11pre7 the following lowercase() recipe fails. It is clobbering the value of $variable and giving back null. The result is the same on both the SysV and the BSD platforms here, and testing it with another command that has no characters from $SHELLMETAS has the same poor results, so invoking a shell is not the problem.

The version of tr there does not require brackets, so that is not the problem. And I swear on a stack of manuals that it worked properly all this time, but in my testing today it consistently fails.

      variable = "VaRiAble"

      :0 Dbir
      * variable ?? [A-Z]
      variable=| echo $variable | tr A-Z a-z

This works, and I'm using it for now:

      :0 D
      * variable ?? [A-Z]
      { variable = `echo $variable | tr A-Z a-z` }

If someone knows a good reason that the first syntax clobbers the previous value of the variable on the way in but the second allows it to be used in the command, please share. The whole thing looks six-legged to me.

22.4 Forwarding with ! token and -t switch [toc]

[david] You very likely do not need "! -oi -t" there because (1) I don't see how uuencode is going to output a line consisting of a lone period and nothing else and (2) in more recent versions of procmail, if $SENDMAIL's basename begins with "sendmail", procmail translates "!" to "| $SENDMAIL -oi". So don't do this:
      :0
      ! -t foo@some.net

To send a message to a list of people, the easiest way is:

      :0
      ! `cat $HOME/.mailing-list`

22.5 Flag c is different in 3.11pre7 [toc]

[david] Before 3.11pre7, adding a c flag to a filtering or variable capture recipe when LOGABSTRACT=all logged the recipe, even though it was non-delivering. The fix seems to have gone overboard; now c with `LOGABSTRACT=all` still logs saves to folders but doesn't log pipe or forward actions. Here is a sample rcfile exemplifying this problem:
      LOGFILE=/tmp/log
      LOGABSTRACT=all
      :0c
      ! cassaign@iml.univ-mrs.fr
      :0c:
      /tmp/file
      :0
      /dev/null

With 3.11pre4, the three actions were logged; with 3.11pre7, the first one isn't.

22.6 FROM_MAILER has changed between 3.10 and 3.11pre7 [toc]

[elijah] reports: try this with vi(1)
      :r! cat p-t
              VERBOSE=yes
              :0
              * ()\/^FROM_MAILER
              {
                      LOG=$MATCH
              }
              :0
              /dev/null
      :r! /usr/local/bin/procmail -m p-t < to-test1

Try feeding message with From field

      "From: wje@netcom.com (William J. Evans; mail protected by sp\
       amgard{tm})"

old

      ()\/(^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )(.*[

new

      ()\/(^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*

Note that the first ".*" has been changed to "[^>]*"; the addition of several account names such as "ops", "MMGR", etc; and the much more elaborate end condition "([^).!:a-z0-9].*)?$[^>]" versus "(([^).!:a-z0-9][-_a-z0-9]*)?[%@> ][^<)]*(\(.*\).*)?)?$([^>]|$)".

22.7 V3.10 and regexps bug [toc]

I have noticed (pm 3.10) that recipes without regexps will sometimes send procmail haywire in such a way that the message is delivered as msg.xxx in the home directory.

[philip] Procmail version 3.10 doesn't know to skip blank lines, and I believe it'll treat them as requests to deliver to $MAILDIR, which defaults to $HOME. This was fixed in 3.11pre3.

22.8 V3.11pre7 and MATCH strips all leading blank lines in body [toc]

[phil] This is so that a match that starts with a ^ doesn't include the newline, ala the condition:
      * SOMEVAR ?? blah blah blah \/^(whatever|something)

With enough effort, however, you can work around it. Don't barf:

      :0
      * BODYLINES ?? ^^.*$\/(.*$)+
      { REMAININGLINES = $MATCH }
      :0E
      { REMAININGLINES }
      :0
      * BODYLINES ?? ^^\/.*$
      { THISLINE = $MATCH }
      :0E
      { THISLINE }
      # Okay, make sure THISLINE didn't lose a leading newline
      :0
      * ! BODYLINES ?? $ ^^$\THISLINE
      { THISLINE = "
      $THISLINE" }
      # Now check REMAININGLINES
      :0
      * ! BODYLINES ?? $ ^^$\THISLINE($)$\REMAININGLINES^^
      { REMAININGLINES = "
      $REMAININGLINES" }

      LOG="REMAININGLINES = |$REMAININGLINES| THISLINE = |$THISLINE| "

HOWEVER... this will exhibit a latent procmail bug, where when procmail does the stripping (with those two lines of C shown above) it fails to decrease the length of the match, so that when the first line of what is being matched is empty (i.e., what is being matched starts with a newline), then you get the first character of the second line in the match. [patch posted 1997-10-20 to procmail mailing list]

22.9 Match \/ bug in 3.11pre7 [toc]

James Waldby j-waldby@uiuc.edu, 16 Nov 1997 in procmail mailing list

the ()\/.*$?etc and \\/.*$?etc patterns match all of the messages. And of course it turned out that the two messages that didn't match the \/.*$?etc pattern had no / in the message body, and of course "* \/.*$?"etc left $MATCH unchanged.

Now, one more question. With the following in .procmailrc:
      :0B
      * ^^\/.*$?.*$?.*$?
      { LOG=" a |$MATCH| a " }
      :0B
      * ()\/.*$?.*$?.*$?
      { LOG=" b |$MATCH| b " }

      :0B
      * \\/.*$?.*$?.*$?
      { LOG=" c |$MATCH| c " }

after the command "formail < mm -s procmail" with one message in mm with a message body that begins:

      L1
      L2
      L3
      L4

the log file said

      a | L1
      L2
      L3| a  b | L1
      L2| b  c | L1
      L2| c

which means, after \/.*$?.*$?.*$? $MATCH was first three lines of body, but after ()\/.*$?.*$?.*$? or \\/.*$?.*$?.*$? patterns, only first two lines.

[era] Fascinatingly, this looks like a new bug in Procmail. Even with the question marks taken out, I get the same result. (And in none of the cases is the trailing newline included in the log message.)

[phil] Okay, here goes: procmail starts the body with an implicit newline to handle the matching of ^ and ^^ (and a leading $ if you wanted to be confusing). With the first recipe, the newline is 'eaten' by the ^^ (a single caret would work just as well). With the others, however, procmail will match zero instances of anything in the first ".*" so that it can match that initial newline with the first dollar sign. Procmail always takes the match that starts the earliest.

To prove that this is the case, try the following recipe:

      :0B
      * ()\/.+$?.*$?.*$?
      { LOG = " b |$MATCH| b " }

With the '+' in there instead of a '*', it has to skip past the leading newline to the first real line. Of course, the above will skip all blank lines, not just the implicit leading one, so it might not be exactly what you want to do.

To sum it up: explicit anchoring is a Good Thing.
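
The rule that the earliest-starting match wins, even when that match is empty, is shared by other regexp tools. Here is a small illustration with sed; it is sed and not procmail, so take it only as an analogy, and first_match is a made-up helper name.

```shell
#!/bin/sh
# first_match REGEXP TEXT
# Show where sed finds the first match of REGEXP in TEXT by wrapping
# the matched part in square brackets.
# REGEXP is pasted into the sed command as is, so it must not contain '/'.
first_match() {
    printf '%s\n' "$2" | sed "s/$1/[&]/"
}

first_match 'b*'  'abc'    # prints []abc   (empty match at the very start)
first_match 'bb*' 'abc'    # prints a[b]c   (at least one b is required)
```

The first call behaves like the ".*" recipes: nothing forces the match to skip ahead, so it starts as early as possible. The second mirrors the ".+" variant, where the match has to move forward to find something real.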

[era] Then why does this also grab only two lines?

      :0B
      * ()\/$.*$.*$
      { LOG=" d |$MATCH| d " }

And/or why isn't this leading newline logged?

[phil] Because that leading newline isn't there. It's purely a figment of the regexp engine's imagination.

[david] Philip has already answered the question, but I'd like to supply an answer from a different perspective:

The initial "$" is matched to a putative newline, not to a real one that is actually in the text. Putative newlines are never included in $MATCH.

Comparably, in more recent versions of procmail that do not drop a closing newline from $MATCH,

      * HOST ?? ^^\/.+$

will set $MATCH to a value that ends with the last character of $HOST, not with an appended newline, because the $ matched a putative newline, not a real one.

That's why the above recipe doesn't result in a value for $MATCH that starts with a newline; that first $ is matched to a putative newline, which procmail never puts into $MATCH.

22.10 Versions 3.11pre5 and pre6 are dangerous [toc]

[phil] These versions were bad enough to make a lot of people reinstall 3.11pre4. pre5 coredumped because of a missing return statement, and pre6 swapped your uid with your gid, generally resulting in the bouncing of all your mail.
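
A guard against the two bad releases could look like the sketch below. The function name is hypothetical, and the version string is passed in by hand; it is the kind of string procmail -v reports, but parsing that output is left out here.

```shell
#!/bin/sh
# is_risky_version VERSION
# Succeed (exit status 0) when VERSION names one of the two releases
# the text warns about, fail otherwise. A leading "v" is accepted.
is_risky_version() {
    case "$1" in
        3.11pre5|v3.11pre5|3.11pre6|v3.11pre6) return 0 ;;
        *) return 1 ;;
    esac
}

if is_risky_version "3.11pre6"; then
    echo "3.11pre6: go back to 3.11pre4 or move to a later release"
fi
```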

23.0 Smartlist [toc]

23.1 Installation trouble: getparams [toc]

Does anyone out there know what this error means when it occurs while installing Smartlist? Procmail is already installed on the system (by the sysops).
      make: *** No rule to make target getparams

[Hal Wine] Yes, it means that you haven't built procmail yet. Build procmail first, then execute Smartlist's install.sh script. You need to get and untar the procmail sources in your own directory, then get and untar the corresponding Smartlist sources in the same directory tree.

Then build (but don't install) procmail, then install Smartlist using the install.sh script. Smartlist uses and builds files in the Procmail source tree, so that has to be done first.

[sysops] don't have the time to mess with getting Smartlist running. Obviously, when I attempt to install Smartlist, it's not finding Procmail. What do I have to do to get the install program to find Procmail?

If the sysops aren't going to install Smartlist, read all the sections in the Manual about non-root use of Smartlist (it works fine).

You should make sure that smartlist, when invoked, uses the matching version of procmail. This means either use the version of Smartlist that matches the sysop installed version of procmail, or set up your PATH such that you use the version you built. If you use your own version, make sure it uses the same locking strategies as the "official" version.


24.0 Additional procmail or MUA software [toc]

24.1 Ftpsearch [toc]

If the mentioned URLs are not alive any more, please use ftp search located at http://ftpsearch.ntnu.no/.

24.2 Comstat to handle multiple mailboxes [toc]

ftp://ftp.belwue.de/pub/unix/xcomsat.tar.gz

24.3 Elm and pgp support (Mutt) [toc]

"Michael Elkins' me@cs.hmc.edu ftp directory" ftp://ftp.cs.hmc.edu/pub/me/ http://www.cs.hmc.edu/~me/mutt/ http://www.cs.hmc.edu/~me/elm/me.html

[EXT Liviu Daia daia@stoilow.imar.ro mentions that] ...Provided that you configure it correctly, it will use lynx to convert HTML attachments to plain text automatically, and display them in its pager. You can reply in plain text to those attachments, and you can also do the same thing with any kind of attachment for which you give it a way to convert to plain text. It's definitely not aimed at the beginner level like Pine, but it's far more powerful too. Also GPL-ed.

24.4 MH sites [toc]

"New MH" ftp://ftp.math.gatech.edu/pub/nmh/nmh.tar.gz http://www.math.gatech.edu/nmh/

25.0 Additional procmail software for Emacs [toc]

25.1 What is Emacs [toc]

...first thing I learned on a Unix machine was that vi is a text editor and Emacs is a way of life. --David W. Tamkin dattier@wwa.com

Emacs refers to a programming platform (it's not only a text editor, or a programming editor, but it does almost everything you tell it to do except make your coffee) which can be found on almost any Unix platform. Nowadays Emacs is also available for the PC platform. There are two flavours to choose from: Emacs, maintained by the FSF (Free Software Foundation), and XEmacs, sometimes called "Emacs the next generation", because it has a better graphical user interface (GUI) and an internally advanced OO design (it can highlight on a tty, whereas Emacs can't). XEmacs is maintained by a group of programming wizards.

See ftp://cs.uta.fi/pub/ssjaaa/elisp.html

Emacs add-in packages are written in Lisp, and the Lisp file extension is .el. Inside each package you will find instructions on how to install and use it.

25.2 Emacs and procmail mode and Lint [toc]

A procmail mode is available for Emacs, and it can also lint procmail recipes. People familiar with C coding know lint, a rigorous code syntax checker. You can read about this Emacs mode at
      ftp://cs.uta.fi/pub/ssjaaa/ema-tiny.html --> TinyPm

And to use it, you need the lisp libraries from

      ftp://cs.uta.fi/pub/ssjaaa/tiny-tools.tar.gz

The tgz kit has the last released version, which may already be old, so get the latest update from the FileServer by sending the following message. You can also order the procmail lint test file with "send pm-lint.rc".

      To: jari.aalto@poboxes.com
      Subject: send tinypm.el

25.3 Emacs and lining up backslashes [toc]

Some time ago I wrote a makefile for my Emacs tgz kit, and as a side effect I got frustrated with the use of backslashes within the make rules. This backslash problem is universal in almost every programming language (e.g. C/C++ macros), including procmail, where you sometimes use echo a lot:
      :0 h
      * condition
      | ( cat -; \
          echo "And the body text"; \
          echo "follows here with"; \
          echo "these echoes"; \
          ) | \
          sendmail -t

Ouch. That looks bad. Is there a line-up tool anywhere? Yes, get my Emacs tiny-tools.tar.gz and look at the file tinymy.el, which defines the function timy-backslash-fix-paragraph. Here is a piece of Lisp code to put in your .emacs to make the key Control-\ run the backslash fix:

      (global-set-key "\C-\\"  'my-backslash-default-column)
      (defun my-backslash-default-column (&optional arg)
        "Col 76."
        (interactive "*P")
        (autoload 'timy-backslash-fix-paragraph "tinymy" t t)
        (timy-backslash-fix-paragraph (or arg 76) 'verb)
        )

After that, just put your cursor inside the paragraph and hit Control-\ to get the following line-up effect. The column position is best set near the right margin, but no further than column 80, a regular page's maximum.

      :0 h
      * condition
      | ( cat -;                                                      \
          echo "And the body text";                                   \
          echo "follows here with";                                   \
          echo "these echoes";                                        \
          ) |                                                         \
          sendmail -t
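
Outside Emacs, a similar line-up can be approximated with a small awk filter. This is a sketch and not a port of tinymy.el: align_backslashes is a made-up name, a tab is counted as one column, and a line that is too long for the requested column simply gets the backslash appended.

```shell
#!/bin/sh
# align_backslashes [COLUMN]
# Read text on stdin. For every line ending in a backslash, pad the
# text so that the backslash lands in COLUMN (default 76).
align_backslashes() {
    awk -v col="${1:-76}" '
        /\\[ \t]*$/ {
            sub(/[ \t]*\\[ \t]*$/, "")       # strip old padding and backslash
            if (length($0) >= col - 1) {
                print $0 " \\"               # too long: just append
            } else {
                fmt = "%-" (col - 1) "s\\\n" # pad, then backslash, newline
                printf fmt, $0
            }
            next
        }
        { print }                            # other lines pass through
    '
}
```

A call such as align_backslashes 70 < recipe.rc writes the realigned text to standard output; the input file is left untouched.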

Guys, Emacs is available for every platform, even for Windows95 and WindowsNT. So, go ahead and install one if you haven't already. Setting up your personalised Emacs may involve a steep learning curve, but it's well worth the effort :-)

25.4 Emacs and browsing mailbox files [toc]

If you use Gnus as your MUA, then you already can browse mailboxes. If you just want to read some arbitrary mailbox without firing up Gnus, then you can use the package TinyMbx.el. It defines a special mailbox reading minor mode that is activated when you visit a mailbox file. You can copy, file, or delete messages, or mail the author of the current message. There is no separate summary buffer as in RMAIL, but you move from one message to another with the PgUp and PgDown keys.
      ftp://cs.uta.fi/pub/ssjaaa/tiny-tools.tar.gz

25.5 Emacs and live-find-file.el [toc]

http://www.cs.indiana.edu:800/LCD/cover.html --> live

You definitely want this if you browse procmail log files. This package updates the logfile buffers whenever they change on disk. The package actually launches tail -f for every file you want to view "live".

It is unfortunate that the package won't run the find-file-hooks by default, so put the code below in your .emacs. It makes the package behave like an ordinary find-file, and you can set up the fontifications:

      (defconst live-file-notify nil)   ;; Notify is too slow; turn off
      (defadvice live-find-file (after my act)
        (run-hooks 'find-file-hooks))

25.6 Emacs and font-lock.el [toc]

Font-lock comes standard in Emacs releases. You can colorize your procmailrc if you use font-lock. There is a local-variables way to set up the font-lock variables, but please don't use it, because it forces the file to use those values. If you give the file to someone else, you can bet your paycheck that they won't like your colors. You don't know what I'm talking about? I mean these lines at the end of a file:
      # Local Variables:
      # mode:text
      # eval: (progn (setq font-lock-keywords ... values...)
      # End:

Here is a better, lisp'ish solution. Put this in your .emacs and reload it with M-x load-file. When you load a file whose name matches procmailrc or procmail.log, the font-lock attributes for the file get set. Change the regexps if your procmail filenames are different.

      (add-hook 'find-file-hooks 'my-find-file-hooks)
      (defun my-find-file-hooks ()
        (require 'cl)
        ;;   colors are available to Emacs only under X window
        (when (and window-system
                   (fboundp 'font-lock-mode) ;; make sure this is present
                   )
          (cond
           ((string-match "procmailrc" buffer-file-name)
            (setq font-lock-keywords
              (list
                '("#.*"             . font-lock-comment-face)
                '("^[\t ]*:.*"      . font-lock-type-face)
                '("[A-Za-z_]+=.*"   . font-lock-keyword-face)
                '("^\\*.*"          . font-lock-doc-string-face)
                ))
            ;; Turn the fontifying mode on if it's not on already
            ;;
            (unless font-lock-mode (font-lock-mode 1))
            )

           ((string-match "procmail.log" buffer-file-name)
            ;;  The strings "" in the procmail log makes font-lock crazy,
            ;;  We kill the String class from the buffer with
            ;;  these statements.
            ;;
            (let ((table (make-syntax-table)))
              (modify-syntax-entry ?\" "_" table) ;; Change "
              (set-syntax-table table))
            (setq font-lock-keywords
              (list
               (cons "Opening "       'font-lock-type-face)
               (cons ".* error .*"    'font-lock-keyword-face)
               (cons "Folder:"        'font-lock-type-face)
               ))
            (unless font-lock-mode (font-lock-mode 1))
            ))
          ))
      ;; End code

26.0 Procmail, Emacs and Gnus [toc]

26.1 Gnus pointers [toc]

"Gnus" http://www.ifi.uio.no/~larsi

"Gnus manual: procmail" http://www.ifi.uio.no/~larsi/www.gnus.org/manual/gnus_6.html#IDX1501

"Gnus Hypertext search archive" http://www.miranova.com/gnus-list/

26.2 Why use procmail with Gnus [toc]

Gnus has very powerful mail split methods, and the normal reaction against the need for procmail is: "Hey, Gnus does my mail splitting, I don't need procmail". The difference between Gnus and procmail splitting is quite easily explained: you want procmail to preprocess the mail before Gnus ever sees it, and then postprocess the mail with Gnus (read it, move mail from one inbox to another).

Case1: Gnus and regular mailbox, no procmail. Gnus reads directly one huge mailbox where all incoming messages are. When the user starts Gnus, it slurps the whole mailbox and starts splitting the mail according to its split rules.

      mail -> $MAIL --> fire up Gnus  --> split1.mbx split2.mbx ....

Case2: procmail and Gnus. The email is always delivered to procmail first. Procmail is free to put the mail anywhere or just let it drop to the user's default inbox, usually pointed to by the environment variable $MAIL.

      mail -> procmail                --> Post processing with Gnus
              [the  ~/Mail/spool]
              --> split1.mbx
              --> split2.mbx
              [The default procmail rule drops to inbox]
              --> $MAIL

You can let Gnus process the messages further, for example by moving messages from one inbox to another.

Summary

So, let procmail drop messages into their inboxes and let Gnus possibly "fine process" these inboxes.

26.3 Setting up gnus for procmail - Basics [toc]

Procmail and Gnus communicate with each other very nicely when you use the mail backends like: nnml, nnmh and nnfolder. See Emacs info Gnus::Node: Select Methods for more.

Here are step-by-step instructions for reading the mail with the nnml mail backend. We suppose that you have the following definition in your procmailrc so that the incoming mail is delivered to the right directory.

The important point here is that the name of the Gnus nnml group is identical, except for the .spool suffix, to the spool file where procmail writes. So if you write to list.procmail.spool, the group in Gnus is named nnml:list.procmail.

      #  .procmailrc excerpt
      #
      MAILDIR     = $HOME/Mail
      SPOOL       = $MAILDIR/spool
      #  The file name must be list.xxxxx.spool in order for
      #  nnml to work in Gnus. Define the procmail mailing list
      #
      PROCMAIL_SPOOL = $SPOOL/list.procmail.spool
      PROCMAIL_RE    = "procmail@Informatik.RWTH-Aachen.DE"
      #  GNUS must have unique message headers, generate one
      #  if it isn't there. By Joe Hildebrand hildjj@fuentez.com
      #
      :0 fh
      | $FORMAIL -a Message-Id: -a "Subject: (None)"
      :0:
      *$ ^From:.*$PROCMAIL_RE
      $PROCMAIL_SPOOL

  1. Copy the Lisp code below to your ~/.gnus
  2. Start Gnus with M-x gnus-no-server (M-x means ESC followed by x). You will see the *Group* buffer appear.
  3. Make the new group with G m list.procmail RET nnml RET. You can read the group as usual and query new mail with the g command.

          (setq
           gnus-secondary-select-methods '((nnml ""))
           ;; See also nnmail-procmail-suffix which is .spool by
           ;; default
           ;;
           nnmail-use-procmail        t
           nnmail-spool-file          'procmail
           nnmail-procmail-directory  "~/Mail/spool/"
           nnmail-delete-incoming     t
           )
    

    I have procmail always deliver to ~/Mail/spool/. If you add more inboxes, create them inside the Gnus Group buffer with G m.
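
The naming rule from this section (spool file list.procmail.spool, group nnml:list.procmail) can be sketched in shell, for instance to list the group names you would have to create with G m. Both function names are made up for the illustration, and the default backend prefix nnml is an assumption matching the setup shown here.

```shell
#!/bin/sh
# spool_to_group FILE [BACKEND]
# Print the Gnus group name that corresponds to spool file FILE:
# the file name without directory and without the .spool suffix,
# prefixed by BACKEND (default: nnml).
spool_to_group() {
    printf '%s:%s\n' "${2:-nnml}" "$(basename "$1" .spool)"
}

# list_spool_groups DIR [BACKEND]
# Print one group name for every *.spool file in DIR.
list_spool_groups() {
    for f in "$1"/*.spool; do
        [ -e "$f" ] || continue    # no spool files: the glob stays unexpanded
        spool_to_group "$f" "${2:-nnml}"
    done
}

spool_to_group Mail/spool/list.procmail.spool    # prints nnml:list.procmail
```

Running list_spool_groups ~/Mail/spool then prints one line per spool file that procmail has created so far.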

26.4 Gnus for procmail - More gnus [toc]

    Okay, let's continue our journey in Emacs. What you read previously was the minimum you needed to get your Gnus to read procmail-delivered files. However, if you're new to Gnus, here are some more tips and basic instructions. The best advice I can give is that you go to each buffer (in Group, press G C-h, and in Summary, C-h m) and print out the commands that you see listed.

    In Group buffer

    • When you press g to get new mail to these groups, the group _disappears_ if there is no mail. If you want the group to be permanently visible, then set
          (setq gnus-permanently-visible-groups  "^nnml\\|^nnfolder")
    
          In an emergency, press L to list all groups.
    
    • If you made a mistake and wrote list.procmaill with an extra l accidentally in the group name, use G r to rename group.
    • Raise or lower the priority of your procmail mail groups with S l. Values 1 or 2 or 3 are good. Consider reserving 1 for your primary mail and 2 and 3 for mailing lists.
    • When you exit a group and have read some articles, they won't show up next time you go there. But by giving prefix argument before entering the group with SPC, Gnus will list all read articles. You give the command like C-u SPC, where C-u is the prefix argument.

    Settings

    • You want gnus to tell you everything it does
          (setq gnus-verbose 10)  ;; 0..10
    
    • You expire articles (get permanently rid of them) with the 'E' command in the *Summary* buffer. The default expiry time is 7 days. You can define the expiry time in days with
          (setq nnmail-expiry-wait 7)
    
    • If you read mailing lists, you want automatic expiry when you have read the article. Use the following to set up groups that use this automatic expiration.
          (setq gnus-auto-expirable-newsgroups
              (concat
               "procmail"
               "\\|other-list"
               "\\|and-some-other-list"
               ))
    

    • B e in the Summary buffer expires current expirable articles.
    • If you want to kill an article (permanently remove it from disk), use B delete.
    • If you want to mark an article as persistent (never expires), use *
    • You don't want these mail groups cached because mail is already in "cache" format. The cache is needed only when you read newsgroups and want to store messages locally.
          (setq gnus-uncacheable-groups "^nn\\(virtual\\|m[hlk]\\|db\\)")
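
To see which group names the regexp above covers, it can be rewritten as an extended regexp and tried with grep -E. The translation from Emacs syntax is mine, and is_uncacheable is a hypothetical helper name.

```shell
#!/bin/sh
# is_uncacheable GROUP
# Succeed when GROUP matches the extended-regexp form of the Emacs
# regexp "^nn\\(virtual\\|m[hlk]\\|db\\)".
is_uncacheable() {
    printf '%s\n' "$1" | grep -Eq '^nn(virtual|m[hlk]|db)'
}

for group in nnml:list.procmail nnmh:inbox nnfolder:mail.urgent; do
    if is_uncacheable "$group"; then
        echo "$group: not cached"
    else
        echo "$group: may be cached"
    fi
done
```

The loop reports the nnml and nnmh groups as not cached and the nnfolder group as possibly cached, so nnfolder groups fall outside this pattern; extend the regexp if you want them covered as well.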
    

26.5 Emacs and Gnus -- Fiddling with spool files [toc]

    Well, to tell you the truth, managing Gnus is scary at first: you make a lot of mistakes along the way, or otherwise change your mind about group names and so on. It's a tricky task to move mail from one directory to another if you decide to rename the spool file where procmail is putting the filtered mail.

    Let's take an example: Say you decide to change the spool file name list.procmail.spool to mail.procmail.spool, because you come to think that all your mail groups should have the same prefix "mail." in your Gnus group buffer. You already changed procmail to output to that file, so now you have two files sitting in your spool directory.

          ~/Mail/spool/list.procmail.spool
          ~/Mail/spool/mail.procmail.spool    # make sure this exists
    
    • Let Gnus read the old file as usual. Press g to read new mail into list.procmail. list.procmail.spool is now empty and merged into the nnml backend group nnml:list.procmail.
    • Make a new group with G m mail.procmail RET nnml RET in the Group buffer.
    • Go to the old list.procmail group and select all articles with M P a. Move the messages with B m to mail.procmail. You will see G marks appear at the beginning of moved articles.
    • Exit the Summary buffer and hit g to see that the messages were transferred to your new mail.procmail group.
    • Kill the old group list.procmail with G DEL
    • One more thing: remove that empty spool file. It is no longer used for anything.
          % rm ~/Mail/spool/list.procmail.spool
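
Since a slip in the order of the steps above could delete unread mail, the final rm can be wrapped in a check that the spool file really is empty. A minimal sketch; the function name is hypothetical.

```shell
#!/bin/sh
# remove_spool_if_empty FILE
# Remove FILE only when it is empty (or already gone). If it still
# holds mail, leave it alone, complain on stderr and return 1.
remove_spool_if_empty() {
    if [ -s "$1" ]; then
        echo "$1 still holds mail; let Gnus read it first" >&2
        return 1
    fi
    rm -f "$1"
}
```

For the example in this section the call would be remove_spool_if_empty ~/Mail/spool/list.procmail.spool.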
    

26.6 Gnus and using both nnml and nnfolder for mail reading [toc]

    The nnml backend is the easiest one to use with procmail. And it is also the fastest backend when it comes to reading the mail, so it is recommended for heavy volume mailing lists. But for private mail, I personally want to keep the mail in regular inbox format. The nnml backend stores articles into separate numbered files
          ~/Mail/mail.procmail/   --> files: 1 2 3 4 5 6 7...
    

    whereas the standard unix mbox backend nnfolder stores the mail in a single file. Don't mind that directory nnf yet, we'll come to that later.

          ~/Mail/nnf/mail.procmail    # one huge file
    

    The following explains how you get the nnfolder backend to read your mail. We suppose that you want to keep the mailbox mail.urgent in nnfolder format. After this you should have a setup where two backends read your incoming mail: nnml handles your mailing lists and nnfolder handles your other private mail.

    The whole procedure in a nutshell

    • Create an nnfolder spool directory ~/Mail/spool-nnfolder/. In addition to ~/Mail/spool above, procmail will be writing to this directory too.
    • Create the nnfolder server "nnfspool". From the Group buffer: press ^ to enter the server buffer and use c to copy some existing nnfolder server with name nnfspool.
    • Define the server parameters so that it points to ~/Mail/spool-nnfolder/. Select e in the server buffer and change all the needed variables.
    • In the Group buffer, create a new group nnfspool:mail.urgent.
    • Get new mail for that group with g

    Also update your mail reading backend list

          (defconst gnus-secondary-select-methods
            '((nnml "") (nnfolder "") ))
    

    That's it. You can add new servers for other directories if you need to read mail from, e.g., your normal inbox: $MAIL

26.7 Gnus and article snippets [toc]

    [These articles have been collected from the GNUS hypertext archive]

    I'm also a bit confused by the proposed solution of having procmail filter incoming mail in a nnmail-procmail-directory instead.

    You have Procmail stuff mail in spool files, pre-sorted and filtered. Gnus then picks these up and stuffs the messages in the appropriate groups. Gnus uses movemail to actually move the mail out of the spool, and movemail uses locking that Procmail understands, so there is no danger of mail loss.

    Why are nnfolder-directory and nnmail-procmail-directory two different directories if nnmail-procmail-directory will contain the mail boxes that procmail appends to and nnfolder-directory is supposed to be "All the nnfolder mail boxes will be stored under this directory"?

    Because Procmail should stuff its mail in different folders, not in the ones that your regular mail is stored in.

    Is the idea to have Gnus use nnmail-procmail-directory as a temporary directory that it draws from to process and then deposit nnfolder mailboxes in the nnfolder-directory ?

    Yep -- Jason L Tibbitts III (tibbs@hpc.uh.edu)


    Procmail settings
          (setq nnmail-use-procmail t)
          (setq nnfolder-directory "~/gMail/")
          (setq nnmail-spool-file 'procmail)
          (setq nnmail-procmail-directory "~/incoming/lists/")
          (setq gnus-secondary-select-methods '((nnfolder "")))
          (setq nnmail-procmail-suffix "")
    

    Procmail is adding incoming mail to ~/incoming/lists/listname. The nnfolder groups I subscribed to are named "nnfolder:lists.listname" Gnus does create the ~/gMail/lists directory with a zero length file in this directory for each list, but doesn't move any mail over and so it thinks I have "No more unread newsgroups".

          (nnmail-get-spool-files)
    

    After much experimentation, I finally got movemail to work. I changed nnfolder-directory to "~/gMail/lists/" and Gnus now moves mail from "~/incoming/lists/" to corresponding groups in "~/gMail/". My problem seems to be solved, but still these workings seem counter-intuitive to me. By what the manual has to say about nnfolder-directory I would think Gnus should build the nnfolder groups in "~/gMail/lists/" instead given my definitions.

    I think nnmail expects the spool files to be called "~/incoming/lists.whatever", not "~/incoming/lists/whatever".

          (setq nnmail-procmail-directory "~/incoming/lists/")
    

    I thought you said the groups were called "lists.whatever"? So the spool files were called ~/incoming/lists/lists.whatever.spool, then?

26.8 Emacs GNUS - POP - Procmail [toc]

    Is it possible to get new mail via POP, run it through procmail (for quick things like trashing junk mail and archiving mailing lists) and then have Gnus do its own mail processing? This is basically what I do now with procmail in my .forward file and all output going into ~/.MailBox for Gnus to find.

    [Mark Moll (mmoll@cs.cmu.edu) 08 May 1997 ] First, let Gnus know that you're using procmail:

          (setq nnmail-use-procmail t
          nnmail-procmail-directory "~/Mail/spool/"
          nnmail-procmail-suffix ""
          nnmail-spool-file 'procmail)
    

    Second, let gnus pop your mail every 5 minutes and invoke procmail:

          (defun mm-pop-mail () (interactive)
              (call-process "/usr0/mmoll/bin/procinc"))
          (gnus-demon-add-handler 'mm-pop-mail 5 t)
          (gnus-demon-init)
    

    Finally create the following script (called procinc in the previous step):

          #!/bin/sh
          MOVEMAIL=/usr/local/lib/xemacs-19.14/lib-src/movemail
          ORGMAIL=$HOME/.newmail
          $MOVEMAIL kpop://ux2.sp.cs.cmu.edu/mmoll $ORGMAIL
          # this is adapted from the procmail (1) man page:
          if cd $HOME &&
             test -s $ORGMAIL &&
             $HOME/bin/lockfile -r0 -l3600 $HOME/.newmail.lock 2>/dev/null
          then
              trap "rm -f $HOME/.newmail.lock" 1 2 3 15
              umask 077
              # keep the fetched mail if formail or procmail failed
              $HOME/bin/formail -s $HOME/bin/procmail < $ORGMAIL &&
                  rm -f $ORGMAIL
              rm -f $HOME/.newmail.lock
          fi
          # remove an empty leftover file
          test -s $ORGMAIL || rm -f $ORGMAIL
          exit 0
    

    Instead of using a demon you can, of course, also pop your mail manually by pressing g in the Group buffer if you add the following line to your .gnus:

          (add-hook 'gnus-get-new-news-hook 'mm-pop-mail)
    



    From: Markus Dickebohm m.dickebohm@uni-koeln.de 1997-06

    Recently I switched to procmail to filter some mails from high volume mailing lists out of my inbox (I don't like my mail notifier to blink every few seconds).

    Personal mails and mails from some low volume lists stay in /var/spool/mail/$USER.

    I set nnmail-use-procmail and both the personal mails and the procmail-filtered mails are incorporated to Gnus. That's exactly the way I like it.

    Today I started Gnus and two new nnml groups showed up. The reason was that the procmail rule produced a file "ding.spool" while the nnml group I used for this list via the nnmail-split-methods variable was "Ding".

    This behaviour shows that Gnus doesn't split the procmail filtered mails again. As I understand the manual, the variable nnmail-resplit-incoming is responsible for that. Do I have to set this variable or is it OK to get the procmail rule and nnmail-split-methods in sync?

    The manual says.. "This also means that you probably don't want to set nnmail-split-methods either, which has some, perhaps, unexpected side effects."

    This is not what I want, since the remaining mails in /var/spool/mail/$USER should be split further by Gnus. Do I really have to decide between procmail and nnmail-split-methods, or is it justified to get the best from both?



    In the Info file, section `Mail & Procmail' (or so), I read:

    ... If you use procmail, you should set nnmail-keep-last-article to non-`nil', to prevent Gnus from ever expiring the final article in a mail newsgroup. This is quite, quite important.

    Why? I thought this was important only if the nnmail-use-procmail variable is nil and the .overview files are updated with a script. When nnmail-use-procmail is t and procmail writes its stuff to the spool files, (ding) knows everything about all its messages.

    ... being able to reliably deliver mail directly to (ding)'s nnmh directories, for example, using procmail would be very nice...

    As already hinted at by Per Abrahamsen this is possible as long as you don't move or copy articles (within) ding into these directories. Just set nnmail-keep-last-article to be true.

    But that's an awfully big exception to what would be a rather nice feature. Certainly filing mail into different mail groups is something I do on a regular basis. That's why I am advocating pre- and post-hooks for all modifications to the overview/active information. With that in place, it would be possible to use a locking mechanism to prevent procmail and (ding) from both trying to modify these files at the same time. Then, copying and moving messages between mail groups during procmail deliveries would be 100% reliable.

    Unfortunately, there's no simple way to allow moves and copies into groups that have external delivery agents. The pre- and post- hooks stuff will solve the problem of safe overview / active file update. This is only part of the problem for move/copy. If an article has arrived since you last checked for new news, then ding doesn't quite "see" it (as it doesn't "see" new news until you ask it to look). What's needed here is for ding to update its notion of what the last article in the group actually is before doing the move/copy--i.e., to run a local *-get-new-news (of course, locking via a hook is still required).

    Adding this will need a lot of mucking around with the internals, the way things currently stand.

    Another approach entirely might be to wait until the stuff that was discussed for IMAP gets added--where ding asks the backend for all information and doesn't maintain any state in .gnus. It'll be simple then to make the backend check for new mail before actually copying/moving the article--ding won't have to be fooled as to what the actual article numbers are. You could add something like this right now, but I think it'll really stretch the code some. (cf. gnus-cache.el for the meaning of "stretch" :-).


27.0 RFC, Request for comments [toc]

27.1 RFCs and munged addresses [toc]

The real implementation of news software doesn't care whether the From field is munged or not.

27.2 RFCs and their jurisdiction (munged Addresses) [toc]

[Marty Fouts 1997-11-05 gnus.emacs.gnus] No RFC forces the address of the poster to be a reachable address (indeed, Sender: is sometimes user@host without the domain part) -- it only requires such addresses to be syntactically correct.

The RFCs do not require anything. The RFCs related to Usenet are advisory. (RFCs describe various things and define a small number of standard protocols; netnews is not an Internet standard protocol.)

  1. Not all RFCs are standards.
  2. RFC 1036 specifically states that it is not an internet standard.
  3. The wording of RFC 1036 and 822 WRT the RFC 1036 header is ambiguous. RFC 822 _specifically_ describes the format of a mail message. It does not describe the complete format of an electronic mail address.
  4. Nowhere in 1036 is there language requiring that the address be deliverable to. Further, 822 provides language that would allow for a valid but not deliverable address to be acceptable. [822 doesn't describe addresses, it describes _mailboxes_, which are something similar but not identical.]

    The bottom line WRT RFCs that are informational is that when there is an ambiguity, or a difference between the RFC and the implementation, the implementation (which is what the RFC was trying to describe in the first place) takes precedence.

    As much as y'all want it to be otherwise, the implementation of netnews (i.e. INND, NNTP) doesn't care about whether or not an address can be replied to. It is rumoured that some news posting software checks the validity of an address. Such software is in a tiny minority.

    netnews is a public forum. mail is a private communication medium. Posting in a public forum does not require that I give you access to my private address, just as speaking at a public meeting does not require that I give you my unlisted phone number.

    [Munged From address]

    One thing is for certain: a munged address puts the burden on anyone wishing to send you email, by requiring them to decipher the address. Someone may say: I never "reply by mail" to persons using those phony addresses. Anyone who wishes to send a personal email cannot just hit 'reply'. People who munge their address accept this, which is why they watch the newsgroups for followups regularly. If someone eagerly wants to get personal, he can spend the extra minute to decipher the correct address for the person.

    [Counter argument:] When I was using Pegasus Mail (Win95), it took me about 10 minutes to set up filters that removed over 75% of the spam I received. 10 minutes is too great a burden to you? MY, what a busy person you are.

    [Timothy J Luoma luomat+procmail@luomat.peak.org] What about accounts that I do not control (the network at work), where I have no say over what software is installed? I can say to the sysadmin ``Hey I'd like Pegasus mail installed'' and he nods and mumbles something. He's got 2 years' worth of backlog from there not being a real sysadmin around.

    Furthermore, there are a number of procmail recipes available on the net, that can be used with minor adjustments to filter your mail. No heavy-duty unix skills are required. Just the initiative to take responsibility for your own problems.

    I know procmail very well, and spammers are still getting through. You know why? They refuse to follow all the conventions we depend on. And they spam mailing lists, so I have to filter for that as well.

    I have spent untold hours trying to develop better and better filters with lower numbers of mis-hits. Nothing works as well as not giving more spammers my address. You simply prefer to put the problem off on somebody else, rather than take the time to deal with it yourself. Well, that kind of laziness does seem to predominate in the "world of the internet" these days.

    I have spent the time, learning from what others have done and seeking to improve them. You are certain you are right and refuse to think about it anymore.... and that kind of laziness is all over the Internet.

    The only one it wrongly inconveniences are those who need to email me and have lost my email address. If you want to followup a Usenet post, do it in Usenet. I'll be back here for followups. I get enough email, and don't need email for Usenet threads.

    If you would like me to use a real address, please set me up an account with procmail where I can get all my Usenet related messages sent.

27.3 RFC and valid email address characters [toc]

    What characters are legal in e-mail addresses? So far, I have uppercase, lowercase, digits, _ - + . @

    [elijah] Most any 7bit character. For all practical purposes whitespace (space, tab, newline) are really inadvisable. This post is from a valid address. I also have ones with control characters -- eg <@qz.to> (may not show up right in your newsreader). See RFC822 for the full rules on generating an address, but the quick and dirty thing is any of the "specials" must be quoted to be used.

          See the definition of specials in RFC822:
          specials    =  ()<>@,;:\.[] and a double quote
    

    If you don't believe me, there are mail toys to prove this. Best one I know of right now is Tom Phoenix's "fred&barney"@redcat.com address. You can replace the "&" with just about any string I believe. I've tried it with stuff like "fred($)barney"@redcat.com and it seems pretty stable.
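
The quoting rule can be tried out with a short sketch. This is only an illustration of the RFC822 "specials" rule as read here, not a full address parser; the function names `needs_quoting` and `quote_local_part` are made up for this example.

```python
# RFC 822 "specials": these may not appear in an unquoted atom.
RFC822_SPECIALS = set('()<>@,;:\\".[]')


def needs_quoting(local_part):
    """Return True if the local part cannot be written as plain dot-separated atoms."""
    if not local_part:
        return True
    # A local part is a sequence of words separated by dots, so an empty
    # word (leading, trailing or doubled dot) already forces quoting.
    for word in local_part.split("."):
        if not word:
            return True
        for ch in word:
            # Atoms allow printable 7-bit ASCII only, minus the specials.
            # Space, control characters, DEL and 8-bit input are all flagged.
            if ch in RFC822_SPECIALS or not (32 < ord(ch) < 127):
                return True
    return False


def quote_local_part(local_part):
    """Wrap the local part in double quotes when RFC 822 requires it."""
    if not needs_quoting(local_part):
        return local_part
    # Inside a quoted-string, backslash and double quote need a backslash.
    escaped = local_part.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(quote_local_part("jari.aalto"))      # plain atoms, left alone
print(quote_local_part("fred($)barney"))   # parentheses are specials
```

Note that "&" is not among the specials, so by this reading fred&barney would not strictly need its quotes, while fred($)barney does.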

27.4 RFCs and message's signature [toc]

    According to universal de facto Net convention, there must be "\n-- \n" before the signature. The extra space in the signature delimiter tells that what follows is the user's signature and not part of a message digest, which uses the delimiter "\n--\n". There is no RFC that would address this though.

    And by the way: it's rude to have a sig longer than 1-3 lines. Better yet, move the repetitive information to the X-headers if your MUA supports modifying the headers.

    NOTE: The choice of delimiter is somewhat unfortunate, since it relies on preservation of trailing white space, but it is too well-established to change.

    [Paul O. Bartlett pobart@access.digex.net] E.g. when one is writing text in Pine, the preferred Un*x editor routinely truncates trailing blanks when writing a file. So even if there were "-- " in the signature, which Pine includes automatically as part of the editable text, the editor would simply truncate the blank. The signature delimiter may be "too well-established to change," but it collides with the reality of the tools people use.
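
To make the role of the trailing space concrete, here is a minimal sketch of how a program might split a body at the signature delimiter. The function name `split_signature` is hypothetical, and searching from the bottom of the message is a choice made for this example; the convention itself does not say which occurrence wins.

```python
SIG_DELIMITER = "-- "   # dash, dash, space: the trailing space matters


def split_signature(body):
    """Return (text, signature); the signature is empty if no delimiter line exists."""
    lines = body.split("\n")
    # Search from the bottom so that a "-- " line quoted earlier in the
    # message does not win over the real signature at the end.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == SIG_DELIMITER:
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:])
    return body, ""
```

If an editor has stripped the trailing blank, the line reads "--" and no longer matches, which is exactly the collision described above.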

27.5 Some RFC Pointers [toc]

    http://www.cis.ohio-state.edu/hypertext/information/rfc.html
    • RFC822 Standard for the format of ARPA Internet text messages (the email message format: From, To, Date ...)
    • RFC1036 Standard for interchange of USENET messages
    • RFC1153 Digest message format (1990, status: EXPERIMENTAL)
    • RFC1855 Netiquette Guidelines
    • RFC1991 PGP Message Exchange Formats
    • RFC2076 Common Internet Message Headers
    • RFC2045, RFC2046, RFC2047 MIME
    • RFC2111 Content-ID and Message-ID Uniform Resource Locators

28.0 Introduction to E-mail Headers [toc]

28.1 Lecture [toc]

[9 Aug 1996 Alan K. Stebbens in procmail mailing list]

There are two general classes of headers: those generated automatically by the MTA, and those configured and inserted by the MUA, on the user's behalf.

The former, the ones generated by the MTAs, are used mostly for tracking the e-mail, and generally have nothing to do with the content of the email, much like those bar-code labels FedEx uses to track packages.

The latter, the ones inserted by the MUA or by the user, are just like the shipping label the FedEx customer fills out, ie: they determine the source, the destination, and describe the content of the mail.

It would be overburdensome for the user to generate all of these MUA headers themselves, so the user's mailer generates many or most of them automatically, typically under configuration control. Of course, the user can always override or replace the automatic MUA headers.

The MTA headers, on the other hand, are almost completely automatic and the user almost never can change them. Only under special circumstances should the MTA headers be inserted or modified by the user.

From the user's perspective, however, the e-mail process seems atomic, so that the distinction of these header classes is lost. Even some systems managers or postmasters fail to appreciate that it is during different stages of the e-mail process, that different sets of headers get inserted.

To help clarify this distinction, here's a diagram of the e-mail process and its several stages:

      sender -> MUA -> MTA ->..-> MTA -> MDA ->{maildrop}-> MUA -> reader
      [1]       [2]    [3]        [4]    [5]                [6]

Headers typically provided by "template" by the MUA to the sender, usually during stage [1] (when composing e-mail):

      From:               # who I am
      To:                 # the target
      Cc:                 # people to keep informed, but need not respond
      Bcc:                # secret admirers
      Subject:            # what's the mail about
      Reply-To:           # highest priority return address
      Priority:
      Precedence:
      Resent-To:          # used for redirecting e-mail
      Resent-Cc:
      X-BlahBlah:         # personalized headers

When the sender is done composing, and says "send it" to his/her mailer, some additional headers may get inserted by the MUA at this stage [2]:

      Date:
      Resent-Date:        # if being redirected
      From:               # If not already present
      Sender:             # if a From: is already present
      X-Mailer:           # what MUA composed this message
      Mime-Version:
      Content-Type:       # what kind of stuff is in here
      Content-Transfer-Encoding:
      Content-Length:

When the MTA receives the e-mail from the MUA at stage [3], it may insert additional headers showing the origination of the e-mail:

      From                # if local e-mail, automatic or by -f option
      Date                # If not already present
      Message-Id:         # unique ID for the e-mail; the first MTA
                          # creates this
      Received:           # shows inter-system e-mail tracking info
      Return-Path:        # shows how to get back to the sender

As each MTA hands off the e-mail, additional headers may get added, all as part of the MTA to MTA handoff in stage [3]:

      Received:           # inserted by each MTA

As the final MTA hands the e-mail to a delivery agent (MDA), in stage [4], there are still some more header insertions which may occur:

      Apparently-To:      # added if no To: header exists
      From                # may get added if local e-mail

Some sites add special rewrite rules and filtering to support virtual domains, and these header changes occur at stage [5], just before the incoming mail is dropped. Generally, though, no new headers are added, except possibly one to avoid loops:

      X-Loop: $USER@$HOST # inserted to avoid filtering loops

Finally, at stage [6] when the reader views his/her e-mail, most MUAs will apply a filter to the stored mail causing selected headers to be omitted from the display. In a sense, then, this filtering "removes" the headers from the user's view (although no headers are actually removed by the MUA).

The headers typically omitted are those inserted by the MTAs, and those having to do with the transport process and less with the contents.
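
The stage [6] display filter can be sketched in a few lines. This is an illustration only: the set of hidden header names is an assumption drawn from the MTA headers listed above, and `displayed_headers` is a made-up name, not part of any real MUA.

```python
from email import message_from_string

# Assumed default: hide the transport headers that the MTAs and MDA add.
HIDDEN_BY_DEFAULT = {"received", "return-path", "message-id",
                     "apparently-to", "x-loop"}


def displayed_headers(raw_message, hidden=HIDDEN_BY_DEFAULT):
    """Return the (name, value) pairs an MUA would show; nothing is deleted."""
    msg = message_from_string(raw_message)
    return [(name, value) for name, value in msg.items()
            if name.lower() not in hidden]
```

The stored message is left untouched; only the list handed to the display changes.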

28.2 Applied to received messages [toc]

So, now that we have a common understanding...

The first "From" is a Unix-mail "From " header (note the space). This is inserted automatically by MTAs, unless one is already present and only then if it seems valid.

The second "From:" is generated by the MUA (your personal mailer), either by configuration, or by the user. The rewrite rules in sendmail and most filtering programs concern themselves with the "From:", "To:", "Cc:", "Reply-To:" headers.

I'll assume that if "From smmi" is not "correct", then you must be trying to hide the delivery process, and implementing something of a virtual domain.

In general, it is a bad idea to "correct" the automatic mail headers inserted by the MTAs. This is a different matter than changing addresses to show virtual domains. The "From " header is part of the history of the message, showing how the mail was originated. Similarly, the "Received:" headers should not be messed with. Changing the history of an e-mail message will make it very difficult to diagnose e-mail delivery errors.

That being said, and, since I also believe in the freedom of choice, I will now supply you with "enough rope to hang yourself" :^)

There are two places where you can have the "From " header corrected: just before it gets dropped into the mailbox (for incoming e-mail), or as it gets submitted to the MTA (for outgoing e-mail).

Changing the "From " before it gets dropped is easy. Just use a recipe like this:

      FROM    = `$FORMAIL -zxFrom:`
      DATE    = ...construct the RFC date format
      :0 fhw
      | $FORMAIL -I "From  $FROM $DATE"

The "From " header is created automatically by the MTA (sendmail) when it receives a piece of mail. If the mail is sent through sendmail without using the '-f' option, then sendmail sets the default "From " to that of the current user. If you are not root, or a "trusted user" (see the sendmail man page), then sendmail will ignore the "From " header and either remove it altogether or replace it. Even if you are root, sendmail will replace the "From ", if the e-mail is being received locally (as opposed to from the network).

If you wish to change the "From ", you must invoke sendmail, as root or a "trusted user", and use the "-f" option. EG: to set the "From " to match the "From:" header, use the following recipe, as root:

      FROM = `$FORMAIL -zxFrom:`
      :0
      ! -oi -t -f"$FROM"

Please read the man page on sendmail, noting the use of '-f'.

28.3 Bcc lecture by Alan Stebbens [toc]

Procmail most typically processes incoming email at a destination site; the BCC formatting (or lack of it) is done on outgoing email, at the originating site.

For this discussion, let's make distinctions as to the kinds of mail there are: (a) incoming mail, and (b) outgoing mail. Bcc's are inserted into outgoing mail by the user, and the message is then handed to a MUA. The MUA may then handle the BCC's or defer that to the Mail Transport Agent (MTA), such as sendmail. Whichever agent performs the Bcc function, that function is performed in at least three different ways:

      ------- Blind-Carbon-Copy
      ...
      ------- End of Blind-Carbon-Copy

The original email standard RFC822 says this about Bcc:

4.5.3. BCC / RESENT-BCC

This field contains the identity of additional recipients of the message. The contents of this field are not included in copies of the message sent to the primary and secondary recipients. Some systems may choose to include the text of the "Bcc" field only in the author(s)'s copy, while others may also include it in the text sent to all those indicated in the "Bcc" list.

So, procmail would handle Bcc's correctly if the sender's MUA included the Bcc in the header in the first place. But, since procmail is most typically used on incoming email, it will never have a chance to deal with Bcc: headers.

28.4 Bcc lecture by Philip Guenther [toc]

The Bcc: header should in general not appear in an incoming message (if procmail is used for processing outgoing mail it may occur there). Most (?) Mail User Agents will send a bcc by just removing the header entirely and putting the address in the envelope recipient list with the other recipients from the To: and Cc: headers. Done this way, the address to which the message was bcc'ed *does not occur in the headers at all*, and you are SOL.

By the time procmail is run (in the standard installation), the envelope is lost, and the envelope is the only thing that would let you process Bcc's with any regularity. Even that is suspect: if an alias at another site that contains your address is bcc'ed, then the envelope, by the time it reaches your site, will only contain your (local) address.

Furthermore, the whole point of the Bcc: header is that the people who receive the message do not know the entire list of addresses to which the message was sent. If an alias is bcc'ed, it is not clear whether the members of the alias should know that it was the alias that was bcc'ed and not just the individual in question alone.

There MUST be some trace of the BCC destination that travels with the e-mail. Otherwise, how does it know its destination? If I'm right, then couldn't procmail use this to properly handle the message?

[alan] Only the MTA knows the destination address because it is part of the "envelope", the information which is passed on the "RCPT To: some-user" SMTP line. This information is how the MTA knows to deliver the mail, and not by the contents of the headers.

Of course, when invoked properly, many MTAs can read the headers to obtain the addresses needed on subsequent "RCPT" commands in the ensuing SMTP connections. In fact, the Bcc: header can be read along with the rest of the destination headers to obtain the recipient addresses, but the Bcc: will also be removed from the headers.

The address by which an MTA receives a mail is known as the "envelope address", which may be distinct from any headers in the message itself, or, the same as one of them, for directly addressed mail.
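
The split between envelope and headers can be sketched as follows. This is an assumption-laden illustration of the "remove the Bcc: header, keep the address in the envelope" behaviour described above, not how any particular MUA or MTA is implemented; `prepare_for_submission` is a hypothetical name.

```python
from email import message_from_string
from email.utils import getaddresses


def prepare_for_submission(raw_message):
    """Return (envelope_recipients, message_text_without_bcc)."""
    msg = message_from_string(raw_message)
    # Every destination header contributes to the envelope (the RCPT list).
    fields = (msg.get_all("To", []) + msg.get_all("Cc", [])
              + msg.get_all("Bcc", []))
    envelope_recipients = [addr for _name, addr in getaddresses(fields) if addr]
    # The Bcc: header itself is dropped before the message leaves, so the
    # blind recipients survive only in the envelope.
    del msg["Bcc"]
    return envelope_recipients, msg.as_string()
```

After this step nothing in the message text names the blind recipient, which is why a filter that sees only the headers cannot recover it.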

With mailing lists, for example, the addressee will never see his/her own address, but will see the mailing list in the To: or Cc: header fields. Even here, when mail is addressed to more than one mailing list, there is a lack of standard for determining the address by which a message is received. There are lots of conventions followed, and heuristics, but no clearly defined standard to indicate the cause of delivery.

You may be able to configure your MTA to pass along the envelope in a new header, or pass it by argument to the local delivery program (which can be procmail). It is then up to the local delivery program to use (or not) the envelope address information.

If you wish to understand the limits of your mail system, you should read RFC822 (email formatting standards) and RFC821, which describes the original language of SMTP. There are several extensions in progress, but the basic commands of "MAIL", "RCPT", and "DATA" should suffice.


29.0 Message's headers [toc]

29.1 What's that X-UIDL header? [toc]

[phil]

[David] The advisability of trashing all mail with X-UIDL: headers has been discussed on procmail list recently; apparently it's possible for one to appear in legitimate mail.

[Elijah] Yup. Very true. The most likely case would probably be for certain types of forwarded mail, including some moderated mailing lists. Fluffy's mod.* list had these until I pointed out the wide-spread file-to-/dev/null problem to Fluffy.

29.2 From_ is the envelope sender [toc]

[phil] the address on the "From " line is the envelope sender. If the message has a Return-Path: header, then it would probably be easier to use that instead, as then you don't have to deal with the date as found at the end of the "From " header.
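
As a rough sketch of that advice, the following prefers Return-Path: and falls back to the "From " line only when the header is missing. The function name `envelope_sender` is made up for this example, and it assumes the message text starts with the mbox "From " line when one exists.

```python
from email import message_from_string
from email.utils import parseaddr


def envelope_sender(raw_message):
    """Return the envelope sender, or None if the message carries no trace of it."""
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path")
    if return_path is not None:
        # Return-Path: <user@host> needs no date handling at all.
        return parseaddr(return_path)[1]
    unixfrom = msg.get_unixfrom()   # the "From user@host date" line, if any
    if unixfrom:
        parts = unixfrom.split()
        if len(parts) >= 2:
            return parts[1]         # second word; the date follows it
    return None
```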

DON'T CONFUSE THE ENVELOPE WITH THE MESSAGE. The headers in the message are allowed to contain a list of addresses in the To: and Cc: headers that are totally irrelevant to where the message is going. For example, a message from a mailing list may simply say "To: procmail@Informatik.RWTH-Aachen.DE", with no visible sign that "guenther@gac.edu" is an address to which the message is being delivered. That information, where the message is currently in the process of being delivered to, is found ONLY in the envelope.

Okay, where is this precious envelope? In SMTP the envelope consists of the MAIL FROM: and RCPT TO: SMTP commands. However, when a message is given to the local mailer, this information is typically lost. Well, the envelope sender is usually saved nowadays in the Return-Path: header, but the envelope recipient usually only appears in the form of the login name that the local mailer was passed on the command line. This can be used, for example, by /etc/procmailrc scripts that check $LOGNAME to see where the message is set to go.

A problem arises however when people start creating virtual domains. When sendmail does the aliasing (usually by mailertable I believe?), it totally loses the original envelope recipient address in the rewriting. All the addresses get rewritten to the same thing, and sendmail thus has no reason to differentiate them. Having lost their independent identities, the now-same multiple recipients are merged to form one call to the local mailer.

The key point here is that once the envelope recipient is lost by the virtual domain alias, THERE IS NO WAY TO GET IT BACK! You can wave your hands and try faking it, but no one in the virtual domain can ever get onto a mailing list or otherwise receive a piece of mail for which the header doesn't explicitly contain his/her email address. And furthermore, even doing that faking is extremely difficult to do right. What I show below does NOT correctly handle messages with Resent-* headers. This can result in messages being received by people who shouldn't receive them, possibly violating someone's privacy. Please keep all that in mind if you decide to use it. It handles a goodly percentage of the cases, but it'll bite you badly at some point in the future.

So you may ask, does this mean that virtual domains are hopeless? The answer is no, you just have to be very careful in the sendmail.cf to keep the envelope recipient stashed somewhere long enough that it can be passed as an argument to the local mailer, usually by putting it in the 'host' part of the mailer triple, though with sendmail 8.7.x, putting it into the local part with a '+' would probably be incredibly clean. In the end, it ends up being passed to procmail (standard /bin/mail has no way of handling this, but we already knew that) as another argument (i.e., -a orig-envelope-recip), though with some work it might be possible to do it via a new header, but that's uglier and no more efficient. I don't have the sendmail.cf (or m4 .mc) mods necessary to do this, but if you post to comp.mail.sendmail (after checking the FAQ, I think it might be there) someone may be able to give you further pointers on saving envelope recipients in virtual domain situations.

29.3 Message-Id [toc]

Are there known problems with "valid" mails with illegal MessageIDs? For some strange reason, some people are sending out email with bad message id's. That wouldn't be much of a problem, except that our MITS department won't even consider fixing the bad-message-id unless it causes a problem somewhere else.

Why would they not consider fixing it? Their e-mail software/gateway is broken, and needs fixing. That's that. Direct them to RFC 822, sec 4.6.1. http://ds.internic.net/rfc/rfc822.txt

[Gerald Oskoboiny gerald@impressive.net] Some of the problems with mail containing a bad message id:

Some people (myself included) run filters to automatically delete incoming e-mail if its message-ID has been seen recently, or if it looks bogus.

Some mailing list software (including Smartlist) does not accept e-mail with a message-ID that has been seen recently. Each message must have a unique message-ID. The best way to ensure that msgids are unique in a global context is to include a fully-qualified domain name after the '@'. In particular, a message-ID like <3.0.5.32.19971208192547.007db100@mailhub > is unacceptable for this reason (even if it didn't have a space at the end.)

Some mail archive software (including some that I wrote) uses message-IDs as a unique identifier for that message in the archive. It may reject messages that appear to be duplicates because they have a message-ID used by other messages. (as my software does.)
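
The two checks mentioned above (a plausible-looking message-ID, and one not seen recently) might look like this in outline. It is a deliberately loose sketch: the regular expression is a simplification rather than the RFC 822 grammar, and `msgid_problem` and `DuplicateFilter` are names invented here. The bounded cache is similar in spirit to the id cache that `formail -D` keeps.

```python
import re
from collections import OrderedDict

# Simplified shape check: <something@something>, no spaces or extra brackets.
MSGID_RE = re.compile(r"<([^<>@\s]+)@([^<>@\s]+)>")


def msgid_problem(msgid):
    """Return a short complaint about the message-ID, or None if it looks fine."""
    match = MSGID_RE.fullmatch(msgid)
    if not match:
        return "malformed"
    if "." not in match.group(2):
        # A bare host name cannot guarantee global uniqueness.
        return "domain is not fully qualified"
    return None


class DuplicateFilter:
    """Remember the most recent message-IDs, up to max_entries of them."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def seen_before(self, msgid):
        if msgid in self._seen:
            self._seen.move_to_end(msgid)    # keep recently seen IDs longest
            return True
        self._seen[msgid] = True
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)   # forget the oldest entry
        return False
```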

29.4 X-Subscription-Info [toc]

This is a header that is used by some mailing lists: it contains an email address for un/subscribe, or a URL with said info. Imagine the reduction in bozo messages asking how to unsubscribe from mailing lists. If your mailing list doesn't have it already, make a suggestion to the list's maintainer.

29.5 Reply-To header [toc]

The existence of a Reply-To: means, "IF you reply to me, send it to this address instead of the one in the From: header."

In the case of a mailing list, the list usually is that default mailbox. In that case, a Reply-To header says, "don't send it to the list, send it here instead." Again, it is more a matter of "do what I mean".

"ListAdmin: Don't play with Reply-To" ... RFC-822 on reply-to is just almost hopeless. The reason people do what they do is more likely because they saw someone else doing that, and imagined it was correct, and copied - perhaps slightly varying things along the way.

...If you use a reasonable mailer, Reply-To munging does not provide any new functionality. It, in fact, decreases functionality. Reply-To munging destroys the reply-to-author capability. http://www.unicom.com/pw/reply-to-harmful.html

"Reply problems" http://www.cs.utk.edu/~moore/reply-problem-list.txt

...there are useful things that can be done with these headers. For instance -- on mailing lists where everyone that posts is assumed to be subscribed (like this one), the listserv could add a "Mail-Followup-To: ding@gnus.org" header. It can also be used by the sender as a way to signal "I am subscribed to the list; don't Cc me or anybody else". ftp://koobera.math.uic.edu/www/proto/replyto.html

Keith Moore moore@cs.utk.edu Wed, 11 Feb 1998 14:20:25 -0500 commented on the nmh list. Keith is the IETF applications area director, and used to chair the DRUMS working group.

Please don't implement support for Mail-Reply-To and Mail-Followup-To in nmh. Not only are they nonstandard, they're a poor fix for the problem. Reply-To is widely misinterpreted as the replacement for the From field in replies, in such a way that "reply all" goes to Reply-To + To + Cc if Reply-To is present and From + To + CC if no Reply-to field is present.

RFC 822 has language that appears to support this view. But a careful reading of RFC 822 reveals that this prose does not apply to Reply-To with respect to a "reply all" function, but only with the use of Reply-To in a "reply to author" function.

This leaves us with the situation where the author of a message is unable to specify the complete destination for replies. Even if the author specifies a Reply-To field, if the recipient uses "reply all", addresses from the To and CC field are still included. This is the behavior implemented by almost every UA in existence, but it's almost always the wrong thing to do.

And RFC 822's examples make it clear that Reply-To is intended as the complete destination for replies, not merely a replacement for the From field. The right way to fix this is to correctly interpret Reply-To - not as simply the replacement for the From field in replies, but as the reply destination preferred by the author of the subject message. Adding new headers doesn't fix the problem. It only makes the situation more complex.
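
The difference between the common behaviour and the reading argued for here can be put side by side. This is only a sketch of the two interpretations as described in this section; the function names are invented and no real mail client is being quoted.

```python
from email import message_from_string
from email.utils import getaddresses


def _addresses(msg, *header_names):
    """Collect the addresses of the named headers, in order, without duplicates."""
    values = []
    for name in header_names:
        values.extend(msg.get_all(name, []))
    result = []
    for _display_name, address in getaddresses(values):
        if address and address not in result:
            result.append(address)
    return result


def reply_to_author(raw_message):
    """Reply-To if present, otherwise From."""
    msg = message_from_string(raw_message)
    return _addresses(msg, "Reply-To") or _addresses(msg, "From")


def reply_all_common(raw_message):
    """What most UAs do: (Reply-To or From) + To + Cc."""
    msg = message_from_string(raw_message)
    author = _addresses(msg, "Reply-To") or _addresses(msg, "From")
    rest = [a for a in _addresses(msg, "To", "Cc") if a not in author]
    return author + rest


def reply_all_complete_destination(raw_message):
    """Reply-To read as the complete reply destination when it is present."""
    msg = message_from_string(raw_message)
    return _addresses(msg, "Reply-To") or _addresses(msg, "From", "To", "Cc")
```

With a Reply-To: header present, the last function addresses only the Reply-To mailbox, while the common behaviour still drags in To and Cc.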

Dan's proposal is intrinsically flawed. It incorrectly assumes that the sender can reasonably anticipate the recipient's needs in replying to the message, and that such needs can reasonably be lumped into either "reply" or "followup". It doesn't solve the real problem, which is that responders need to think about where their replies go. Mail-Followup-To won't decrease the number of messages that go to the wrong place.

Suppose I send out a message inviting people to a meeting, and want "normal" replies (presumably accepting or declining the invitation) to go to my secretary. Should I put my secretary's address in "Mail-Reply-To" or "Mail-Followup-To"?

Say I put it in Mail-Reply-To and a responder wants to send a personal reply to me, perhaps because it's sensitive in nature. So he hits "reply to author" thinking that the message will go to me. Instead, the message goes to my secretary. This is Bad.

Say I put my secretary's address in Mail-Followup-To and a responder wants to send a message to the list of recipients of the original message -- maybe that responder wants to let everyone know about cheap airfares to the meeting. So the responder hits "reply to everyone" thinking that the message will go to everyone. Instead, the message goes to my secretary. This is not as bad as the other case, but it's still not desirable.

So if some responses are neither "personal" nor "group" replies, why not define an extensible reply header that would include not only the address but the category of reply? Something like:

      Labelled-Reply-To: secretary; jeeves@cs.utk.edu
      Labelled-Reply-To: mailing-list; listname@foo.com

It turns out that we already have most of this in RFC 822:

      Reply-To: (my secretary) jeeves@cs.utk.edu
      Reply-To: Secretary: jeeves@cs.utk.edu ;,
           The Gang: a@foo, b@bar, c@zot ;

(Unfortunately, phrases are so widely botched, that they probably aren't usable for this.)

Summary:

Stainless Steel Rat ratinox@peorth.gweep.net 1998-02-12 commented in Emacs ding mailing list

Not every mail client supports this. Only the badly written ones fail to distinguish between replies and followups.

When you get right down to it, this proposed standard has two goals:

  1. To make broken MUAs act less brokenly. Well, broken MUAs are not going to implement this standard, anyway; good MUAs do not need it as they already make the distinction between replies and followups.
  2. To make broken mailing lists act less brokenly. Administrators of broken mailing lists have decided that they like it that way. They claim that it makes it easier for their lists' subscribers to reply to the list. The subscribers that "need" list-bound Reply-To headers are using broken MUAs. See #1.

    This proposed standard will not solve any of the problems it attempts to address. It creates headers that are ignored by bad MUAs and are redundant for good MUAs.

    To summarise Keith's statement: From is the originator's mailbox. It is not an 'account'. RFC 822 states that the originator header should contain the correct default reply address.

    This is the scenario that the proponents of these headers have proposed, and the flaw the IETF has found with it.

    Joe is subscribed to a mailing list that he reads from his "private" mail account. For whatever reason, Joe posts a message to that list from work, so his work mailbox is in the From header. Joe does not want to override where responses go with a Reply-To header, but he wants personal replies to go to his private mail account instead of his work account.

    The flaw the IETF found is that Joe is equating his two mailboxes with his private and work accounts. There is no such correspondence as far as RFC 822 is concerned. If Joe is acting in a "private" fashion, the system he is using is irrelevant; his private mailbox belongs in the From header and he should put that mailbox there when he originates the message, regardless of where he physically is when he does so.

    29.6 Mail-Copies-To header [toc]

    [Suggested by Lars, the Author of Emacs Gnus]

    ...Mail-Copies-To: is a header line used in messages on Usenet to direct copies by email of followups to posts. http://www.math.fu-berlin.de/~guckes/rfc/mail-copies-to.html

    [SL Baur steve@xemacs.org] The Mail-Copies-To: header should control how your email (and Usenet) client prepares a followup message. It gives control to the sender of a message whether courtesy duplicate copies of messages should be sent. There are two forms:

          Mail-Copies-To: never
    

    Do not automatically include the sender of the message being responded to. There are two canonical examples.

          Usenet:
          From: foo@foo.bar
          Newsgroups: comp.emacs.xemacs
          Mail-Copies-To: never
    

    A followup in a conforming client should generate in the response message headers:

          Newsgroups: comp.emacs.xemacs

          Email:
          From: foo@foo.bar
          To: mailing-list@somewhere.com
          Cc: luser@somewhereelse.com
          Mail-Copies-To: never
    

    A followup in a conforming client should generate in the response message headers:

          To: mailing-list@somewhere.com
          Cc: luser@somewhereelse.com
    

    The second form includes a properly formed RFC822 email address as the parameter:

          Mail-Copies-To: someaddress@somewhere.com
    

    In this case, the sender of the message is specifically requesting that responses to the message not only go to the main forum (either mailing list or Usenet newsgroup), but a duplicate copy should also be sent to someaddress@somewhere.com. There are (again) two canonical examples.

          Usenet:
          From: foo@foo.bar
          Newsgroups: comp.emacs.xemacs
          Mail-Copies-To: foo@foo.bar
    

    A followup in a conforming client should generate in the response message headers:

          Newsgroups: comp.emacs.xemacs
          Cc: foo@foo.bar [1]

          Email:
          From: foo@foo.bar
          To: mailing-list@somewhere.com
          Cc: luser@somewhereelse.com
          Mail-Copies-To: foo@foo.bar
    

    A followup in a conforming client should generate in the response message headers:

          To: mailing-list@somewhere.com
          Cc: luser@somewhereelse.com, foo@foo.bar [2]
    

    There is no requirement that the address in Mail-Copies-To match the From address.

    Footnotes:
    [1] Or `To: foo@foo.bar'
    [2] It is also acceptable to put foo@foo.bar in the To: line.
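
The four canonical examples above can be condensed into a few lines of Python. This is only a sketch under stated assumptions: the function name `followup_headers` and the plain-dict representation of headers are made up for illustration, and the text above does not say what a client should do when the header is absent, so the sketch simply adds no extra copy in that case.

```python
def followup_headers(original):
    """Build followup headers from the headers of the message being answered.

    `original` is a dict of header name -> value (hypothetical representation).
    """
    reply = {}
    # The followup goes back to the same forum as the original message.
    for name in ("Newsgroups", "To"):
        if name in original:
            reply[name] = original[name]

    # Existing Cc recipients are kept as they are.
    cc = [a.strip() for a in original.get("Cc", "").split(",") if a.strip()]

    # "never" means no courtesy copy; an address means "also send a copy here".
    # Assumption: an absent header is treated like "never" in this sketch.
    copies_to = original.get("Mail-Copies-To", "").strip()
    if copies_to and copies_to.lower() != "never" and copies_to not in cc:
        cc.append(copies_to)

    if cc:
        reply["Cc"] = ", ".join(cc)
    return reply
```

Calling it with the headers of the second Email example returns a To header for the list and a Cc header holding both luser@somewhereelse.com and foo@foo.bar. The sketch always uses Cc for the extra copy; footnotes [1] and [2] note that To is equally acceptable.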

    29.7 Mail-Followup-To and Reply-To-Personal headers [toc]

    [21 Nov 1997, Mutt Development List <mutt-dev@cs.hmc.edu>]

    Jacob Palme just today submitted an Internet-Draft describing Mail-Followup-To. Jacob, the Working Group chair Chris Newman and I all regard this as complementary to my own Reply-To-Personal proposal, an early version of which I posted here and which was also submitted as an Internet-Draft just today. In fact, had my week been a bit less harried, Jacob and I would have issued a joint draft. Within a few days you should be able to view these drafts in the IETF drafts directory on ds.internic.net under the names

    draft-ietf-drums-mail-followup-to-00.txt Jacob Palme's draft on the proposed Mail-Followup-To header.

    draft-ietf-drums-replyto-personal-00.txt My draft on Reply-To-Personal

    29.8 Content-Length header and From_ specification [toc]

    [1996-05-17 From: Jamie Zawinski jwz@netscape.com comp.mail.headers]

    ...I'm not saying that the BSD Mailbox format is good. Just that the Content-Length variant of that format is worse.

    Ok, so someone took the From_ format, and extended it to not require mangling by adding a length indicator to the format. At first glance, this may sound simple and elegant, but it breaks the world, and one shouldn't encourage its use to spread.

    The thing that breaks is taking an existing, widely-implemented format, and adding a requirement that it have a length indicator. This means that any existing software that already thinks it knows how to manipulate that format is going to damage the file (any change to the data will cause the length indicator to be wrong with respect to the new specification but not with respect to the old specification.)

    If the content-length-based format was not otherwise indistinguishable from the ``From '' format, there wouldn't be a problem; the old software would simply fail to work with this new file format, instead of `corrupting' the documents (in quotes, because it's really just a matter of which spec you're following.)

    Also, mailboxes are by their nature a textual format; but, the content-length header measures in bytes rather than lines. This means that if you move the file to a system which has a different end-of-line representation (Windows <=> Mac, or Windows <=> Unix) then the content-lengths will suddenly be wrong, because the linebreaks now take two bytes instead of one, or vice versa.
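
The byte-versus-line mismatch is easy to see with a three-line body. The following Python sketch is illustrative only; the helper name `to_crlf` and the sample body are assumptions, not part of any mail tool.

```python
def to_crlf(body: bytes) -> bytes:
    """Convert line endings to CRLF, as a Unix-to-Windows file transfer might."""
    # Normalise first so that already converted text is not converted twice.
    return body.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


body = b"Line one\nLine two\nLine three\n"
declared = len(body)                 # what a Content-Length header would claim
converted = to_crlf(body)
drift = len(converted) - declared    # one extra byte per line break
```

The text is the same three lines before and after the conversion, but the stored length no longer matches the declared one, so a reader that trusts the header would start looking for the next message in the wrong place.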

    It's impossible for a mail client to look at a file, and tell which of the two formats (From_ or Content-Length) it is in; they are programmatically indistinguishable. The presence of a Content-Length header is not enough, because suppose you were on a system which knew nothing at all about that header, and some incoming message just happened to have that header in it. Then that header would end up in your mailbox (because nobody would have known to remove or recalculate it), and it would possibly be incorrect. (Presume further that the header was not just incorrect, but intentionally malicious...)

    Stricter parsing of the ``From '' separator line doesn't help either, because there are many, many variations on what goes in that line (since it was never standardized either); and also, some mail readers include that line verbatim when forwarding messages (Sun's MailTool, for example) so a stricter parser wouldn't help that case at all, because message bodies tend to contain valid matches.

    Some mail readers attempt to cope with this by recognizing the case where the Content-Length is not obviously spot-on-target, and then searching forward and backward for the nearest message delimiter; but this is obviously not foolproof, and makes one's parser much more inefficient (requiring arbitrary lookahead and backtracking.)
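
A rough Python sketch of that kind of coping strategy follows. It is not taken from any particular mail reader: the function name `split_mbox` is hypothetical, and it only falls back to a forward scan for a blank line followed by a ``From '' line, which is a simplification of the forward-and-backward search described above.

```python
import re

CONTENT_LENGTH = re.compile(rb"^Content-Length:[ \t]*(\d+)[ \t]*$", re.I | re.M)


def split_mbox(data: bytes):
    """Split mailbox bytes into messages, believing Content-Length only
    when it lands exactly on a message boundary."""
    messages = []
    pos = 0
    while pos < len(data):
        # Headers end at the first blank line after the "From " separator.
        hdr_end = data.find(b"\n\n", pos)
        if hdr_end == -1:
            messages.append(data[pos:])
            break
        body_start = hdr_end + 2
        end = None
        match = CONTENT_LENGTH.search(data[pos:hdr_end + 1])
        if match:
            candidate = body_start + int(match.group(1))
            rest = data[candidate:].lstrip(b"\n")
            # Trust the header only if the body ends at end-of-file or
            # is directly followed by the next "From " separator.
            if candidate <= len(data) and (not rest or rest.startswith(b"From ")):
                end = len(data) - len(rest)
        if end is None:
            # Fall back to the classic rule: blank line, then "From ".
            nxt = data.find(b"\n\nFrom ", hdr_end + 1)
            end = len(data) if nxt == -1 else nxt + 2
        messages.append(data[pos:end])
        pos = end
    return messages
```

Even this check can be fooled: a wrong or malicious length that happens to point at a ``From '' line inside a later message body passes the test, which is the point the paragraphs above make.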

    Conventional wisdom is, ``if you believe the Content-Length header, I've got a bridge to sell you.''


30.0 MIME tags [toc]

30.1 MS Exchange application/ms-tnef [toc]

A member of one of my mailing lists appears to be using Microsoft Mail. His messages to the list are usually accompanied by an encoded attachment like this one: "c:\eudora\users\steven@idma.com\attach\WINMAIL11.DAT" The message headers include the following clause: Content-Type: multipart/mixed; boundary="openmail-part-058c9f3d-00000001" This is driving people crazy. What is causing this and is there any way to make it stop?

Most likely the sender is using Exchange (or Windows Messaging or Outlook97) and sent the messages in Rich Text Format. It puts the RTF message in an attachment called WINMAIL.DAT (application/ms-tnef). But this attachment is useless unless the recipient is also using Exchange.

The sender can turn off the RTF option for messages to you. For more information, see: XCLN: Sending Messages In Rich-Text Format http://support.microsoft.com/support/kb/articles/q136/2/04.asp
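
If you want to find out which incoming messages carry such an attachment, a small check along these lines may help. It is a minimal sketch that uses Python's standard `email` package; the function name `tnef_parts` and the rule "file name starts with winmail and ends with .dat" are assumptions based on the example above, not a Microsoft specification.

```python
from email import message_from_string


def tnef_parts(raw_message: str):
    """Return the names of MIME parts that look like Exchange TNEF attachments."""
    msg = message_from_string(raw_message)
    found = []
    for part in msg.walk():
        ctype = part.get_content_type()            # always lower case
        fname = (part.get_filename() or "").lower()
        looks_like_winmail = fname.startswith("winmail") and fname.endswith(".dat")
        if ctype == "application/ms-tnef" or looks_like_winmail:
            found.append(part.get_filename() or ctype)
    return found
```

An empty list means nothing suspicious was found. A non-empty list could be used, for example, to decide whether to send the poster a pointer to the article above.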


31.0 Jokes [toc]

31.1 The ultimate spam filter [toc]

[idea by david] You absolutely want to get rid of annoying email messages? Right, you want the ultimate spam recipe which stops all UBE. Guaranteed, handed to you by "stop-it-all"...
      :0
      /dev/null

32.0 Other Code [toc]

32.1 Email related code [toc]

"Perl ifile" ...ifile is different from other mail filtering programs in three major ways: 1) ifile does not require you to generate a set of rules in order to successfully filter mail 2) ifile uses the entire content of messages for filtering purposes 3) ifile learns as you move incorrectly filtered messages to new mailboxes ifile is not dependent upon any specific mail system and should be adaptable to any mail system which allows an outside program to perform mail filtering. Currently, ifile has been adapted to the MH and EXMH mail systems. http://www.cs.cmu.edu/~jr6b/ifile/ Jason Daniel Rennie jr6b@andrew.cmu.edu

32.2 Expire mail code pointers [toc]

"Perl mail expire" ...This program removes old messages from system mailboxes. It assumes the format of mailboxes to be standard sendmail format mail with a blank line followed by a `From ' line starting each and every message. Mailbox locking is via flock. Works under SunOS. http://www.oasis.leo.org/perl/scripts/net/mail/expire_mail.dsc.html
Phil Male phil@compnews.co.uk

32.3 Html layout of the Code sections [toc]

When you view this document in html format, which is automatically generated from the text file, the code layout may not be exactly as it is in the original pm-tips.txt. You may see strange indentation, strange colors, emphasised text, different character types etc. So,

PLEASE DO NOT COPY THE CODE FROM the .html PAGE, BUT FROM THE ORIGINAL TEXT FILE ftp://cs.uta.fi/pub/ssjaaa/pm-tips.txt

32.4 Perl Extract procmail man pages from procmail-3.11pre7.tar.gz [toc]

  #!/usr/local/bin/perl
  #
  # @(#) Perl pm-man.pls -- Make procmail man pages from the tar.gz kit
  #
  #   File id
  #
  #       $Contactid: jari.aalto@poboxes.com $
  #       $Docid: 1997-10-11 Jari Aalto $
  #       This code is free software in terms of GNU Gen. pub. Lic. v2 or later
  #       File server info: Send subject "send help" to Contactid.
  #
  #   Description
  #
  #       The procmail-3.11pre7.tar.gz has some .man files that can't be used
  #       right. You have to run the whole make process before you can get
  #       the ready man pages. However, this small perl script takes care of
  #       only creating the man pages if that is the only thing you want to
  #       grab from the tgz kit.
  #
  #   Usage
  #
  #       Be in the procmail tar.gz kit's "man" directory and run this script
  #       with the command
  #
  #           % pm-man.pls *.man
  #
  #       And you will have the *.1 pages in the directory. You can view
  #       the man pages with
  #
  #           % nroff -man *.1 | less
  #
  #       And you can make text only pages with the following command.
  #
  #           % sh -c 'for f in *.1; do nroff -man $f| col -bx > $f.txt; done'
  #
  #   Change Log (none)

  # The Procmail arg definitions are in config.h, get them to %def hash

  $f = "../config.h";

  print "Reading definitions from $f\n";
  open(F,"$f")        || die "$f: Can't open [$f]";
  while ( <F> )
  {
      $def{$1} = $2   if /#define\s+([\w_]+)\s+['\"]([^'\"]+)/;
      $def{$1} = $2   if /#define\s+([\w_]+)\s+(\d+)/;
  }
  close F;

  # This is hard coded, can't be read from config.h without
  # proper cpp and the flags. Define this if you need.
  #
  $def{'FM_BERKELEY'} = 'Y';

  $len = @ARGV;
  $i   = 0;

  !@ARGV && die " Mmm, please give *.man files to convert";

  for $manfile (@ARGV)
  {
      $i++;
      ( $to = $manfile ) =~ s/\..*/.1/;     # Change suffix .man --> .1
      print "Making man page $to ($i/$len) ...\n";

      # Ignore safety measures: .ex is nroff's exit code
      #
      open(F,"$manfile")  || die "$manfile: Can't open [$manfile]";
      @content = grep( ! /README|\.ex/, <F>);
      close F;

      # Now expand symbolic names in the man page
      # @MAILFILTOPT@ --> ...
      #
      for ( @content )
      {
          for $label ( keys %def )
          {
              # The @ signs are escaped so that perl 5 does not
              # try to interpolate an array here.
              s/\@$label\@/$def{$label}/g;
          }
          # s/@(.*)@/$1/;       # Treat unknown literals "as is"
      }

      # Write the ready man page
      #
      open(F,">$to")      || die "$to: Can't open [$to]";
      print F @content;
      close F;
  }

  # End of pm-man.pls

32.5 Sh remove matching lines from file [toc]

[era] The name "gred" is rather obscure if you don't know what "grep" stands for. Anyway, this is also really too specialized a script to get such a general-sounding name. Incidentally, I timed this against a Perl one-liner on my /usr/dict/words (which is rather small, though; some 25,000 lines) and found the shell version to be quicker, even with the locking. A good citizen would take care to remove the temp files when done, but since Wotan thought it would be valuable to keep them around for backup, I left implementing that as an exercise. (Hint: they should be cleaned up even if the script is interrupted with ctrl-C.)
      #!/bin/sh
      # gred -- like grep, but remove matching lines
      # syntax: gred regex file
      # locks file while gredding using dotlocking
      #
      case "$#" in
          2) ;;
          *) echo "Syntax: gred regex file" >&2 ; exit 1 ;;
      esac
      LOCK="$2.lock"
      TMP=/tmp/$$.temp
      if lockfile "$LOCK"; then
          mv "$2" "$TMP"
          grep -v "$1" "$TMP" >"$2"
          rm -f "$LOCK"
      fi
      #
      # end of file

32.6 Sh expire mail [toc]

William Avery wravery@wravery.student.princeton.edu

I wrote a shell script a while ago to do something like this using formail. It's probably not the most efficient/effective way to do it though, so I'd welcome any feedback on it. For one thing, now that I look at it again, and after reading this list for a while, I'm not sure that I'm getting the date of the message very reliably.

      # expire_mail: Written by William Avery, March 1996
      #
      # This script will delete messages older than $AGE days from the
      # mailbox specified on the command line. It requires that you
      # have formail installed on your system, and if formail is in a
      # directory other than /usr/bin, you must change the value of
      # $FORMAIL below.

      TEMPMSG=message.tmp
      TEMPMBOX=mailbox.tmp
      AGE=5
      FORMAIL=/usr/bin/formail

      if [ ! -x ${FORMAIL} ]; then
          echo "This script requires ${FORMAIL}, which comes with procmail."
          exit 1
      elif [ ! "${1:-}" ]; then
          echo "Usage: $0 <mailbox path>"
          exit 1
      elif [ -f "${TEMPMSG}" ]; then
          echo "${TEMPMSG} will be destroyed."
          echo -n "Continue? (y/n): "
          read RESPONSE
          case ${RESPONSE} in
              [yY]*)  echo Continuing.... ;;
              *)      echo Good Bye.
                      exit ;;
          esac
      fi

      # First pass: split the mailbox and feed every message back to
      # this same script with the argument "select".
      if [ "${1}" != "select" ]; then
          if [ -f "${TEMPMBOX}" ]; then
              echo "${TEMPMBOX} will be destroyed."
              echo -n "Continue? (y/n): "
              read RESPONSE
              case ${RESPONSE} in
                  [yY]*)  echo Continuing.... ;;
                  *)      echo Good Bye.
                          exit ;;
              esac
          fi
          rm -f $TEMPMBOX
          ${FORMAIL} -s $0 select < $1
          mv -i $TEMPMBOX $1
          exit
      fi

      # "select" pass: one message arrives on stdin.
      # NOTE: subtracting days from a %Y%j value is only correct while
      # both dates fall within the same calendar year.
      TODAY=`date -d "\`date\`" +"%Y%j"`
      EXPIRATION=`expr ${TODAY} - ${AGE}`

      cat > $TEMPMSG

      MESSAGE=`sed -n -e 's/^Date:\(.*\)$/\1/p' ${TEMPMSG} | head -n 1`
      MESSAGE=`date -d "${MESSAGE}" +"%Y%j"`

      if [ $MESSAGE -ge $EXPIRATION ]; then
          cat $TEMPMSG >> $TEMPMBOX
      fi
      rm -f $TEMPMSG

32.7 Gawk expire mail [toc]

by Roman Czyborra czyborra@cs.tu-berlin.de.

      #! /usr/local/bin/gawk -f
      # using GNU version 2.15 of awk: This filter deletes all messages
      # older than expire days from a Unix mailbox. Sample call:
      # gawk -f expire.awk expire=21 box > new && mv new box

      BEGIN {
          # We need a little calendar:
          months = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec";
          split("0 31 59 90 120 151 181 212 243 273 304 334", offset);

          # Default expire is 14 days:
          expire = 14;

          # This is the timezone for Central Europe. Let's forget about DST.
          timezone = +01;
          now = systime();
      }

      NR==1 || !inparagraph && $0 ~ /^From / {
          # The envelope has this form:
          # $1   $2                       $3  $4  $5 $6       $7
          # From czyborra@cs.tu-berlin.de Thu Dec 16 00:22:43 1993

          month = 1 + int(index(months, $4) / 4);
          date  = $5;
          year  = $7 - 1900;

          days = date - 1 + offset[month] + year * 365;

          # Leap year calculation: This will get one day off track
          # in 2100 AD. I'll be dead by then. We may already run
          # into an integer overflow in 2038:
          days += int(year / 4) + (month < 3 && year % 4 == 0 ? -1 : 0);

          # Days were since Jan 1, 1900, we need 1970 instead:
          days -= 70 * 365 + 17;

          split($6, time, ":");
          hour = time[1] - timezone;
          timestamp = days * 86400 + hour * 3600 + time[2] * 60 + time[3];

          expired = now - timestamp > expire * 86400 ? 1 : 0;
      }

      {
          if (!expired) print;
          # Messages must be preceded by a blank line (length == 0)
          inparagraph = length;
      }

      # End of Gawk expire mail
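
The day arithmetic of the gawk script can be cross-checked against a library routine. The Python sketch below repeats the same formula step by step; the function name `envelope_timestamp` and its argument order are assumptions made for this illustration, and `calendar.timegm` from the standard library serves as the reference.

```python
import calendar

# Days before the first of each month in a non-leap year (same table as in gawk).
OFFSET = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def envelope_timestamp(year, month, day, hour, minute, second, timezone=1):
    """Seconds since 1970-01-01 UTC, computed the way the gawk script does it."""
    y = year - 1900
    days = day - 1 + OFFSET[month - 1] + y * 365
    # One leap day per four years; not yet reached in Jan/Feb of a leap year.
    days += y // 4 + (-1 if month < 3 and y % 4 == 0 else 0)
    # Move the origin from Jan 1, 1900 to Jan 1, 1970 (70 years, 17 leap days).
    days -= 70 * 365 + 17
    return days * 86400 + (hour - timezone) * 3600 + minute * 60 + second


# The sample envelope line: Thu Dec 16 00:22:43 1993, local time UTC+1.
sample = envelope_timestamp(1993, 12, 16, 0, 22, 43)
reference = calendar.timegm((1993, 12, 15, 23, 22, 43))
```

For the sample envelope line both values agree, and the same holds for dates around the leap day of 1996. As the comments in the script say, the shortcut stops being right in 2100, which is not a leap year.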


End [toc]

End of document

This material can be publicly distributed and copied with the permission of the Author, provided that you preserve the Author's name and that you distribute it in full and not partially. If you quote parts of this document, please always mention the author's email address or the http reference where to get the document you referred to.

This file has been automatically generated from plain text file with perl 4 script v1.51 t2html.pls
Document author: Jari Aalto
Url: ftp://cs.uta.fi/pub/ssjaaa/pm-tips.txt
Contact: <jari.aalto@poboxes.com>
Html $Doc id: 1998-03-10 10:03 $