Preliminary numbers from the spam filter systems

Conceptual overview of the system, from the August IT Bulletin.
Some implementation details, from the October minutes.

Here are some preliminary stats, not yet automated with pretty pictures.

Date    Mail operations  Received  Delivered  Spamhaus     Total  Delivery
                 logged                       rejected  rejected  deferred
Oct 21           436620    129159     119102         -     21773      9969
Oct 22           615741    176342     161106         -     21382     39015
Oct 23           582819    161320     146855     17175     23312     43323
Oct 24           516969    131195     119738     42106     45911     20708
Oct 25           332870     73818      65166     38322     42079     38565
Oct 26           376255     76906      65671     31705     35480     74799
Oct 27           544149    132174     118437     25540     43539     74482
Oct 28           587707    151495     124216     41534     62344     66971
Nov 08           446280     86942      66282     31642     53953    132536
Nov 09           418564     75359      71569     28054     34023    129277
Nov 10           591846    140726     127821     31031     46476    102443
Nov 11           552509    129283     125770     35144     40729     90157
Nov 12           635015    155032     151906     36254     41825     90021
Nov 13           596586    141512     136626     37388     44312     85549
Nov 14           520909    116398     110391     38018     44572     90133
Nov 15           476263     82958      78402     32539     43754    138578
Nov 16           408663     78985      72845     31353     39344     98346
Nov 17           573807    132521     125410     34183     43262     98286
Nov 18           596762    139570     132524     36852     45943     98310
Nov 19           623655    143343     136429     38837     49046    109270
Nov 20           677490    159566     151706     41079     51458    109958
Nov 21           600802    130012     123667     40877     49206    121563
Nov 22           442323     80522      72212     36301     47232    128001
Nov 23           332234     65981      62305     27367     33099     75759
Nov 24           664515    137088     131507     36771     42760    169893
Nov 25           762182    138433     145690     39235     45556    227838
Nov 26           625116    133959     126821     41537     51896    135139
Nov 27           519829     93925      87783     32527     41760    160267
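
As a reading aid, two ratios can be derived from each full row: deliveries per received message, and the Spamhaus share of all rejections. The Python sketch below copies three rows from the table. The tuple layout and the `ratios` helper are assumptions made for illustration, not part of the reporting scripts.

```python
# Rows copied from the table above:
# (date, logged, received, delivered, spamhaus_rejected, total_rejected, deferred)
ROWS = [
    ("Oct 23", 582819, 161320, 146855, 17175, 23312, 43323),
    ("Nov 25", 762182, 138433, 145690, 39235, 45556, 227838),
    ("Nov 27", 519829, 93925, 87783, 32527, 41760, 160267),
]


def ratios(row):
    """Return two derived ratios for one table row (hypothetical helper)."""
    _date, _logged, received, delivered, spamhaus, total_rejected, _deferred = row
    return {
        "delivered_per_received": delivered / received,
        "spamhaus_share_of_rejects": spamhaus / total_rejected,
    }


for row in ROWS:
    r = ratios(row)
    print(row[0],
          round(r["delivered_per_received"], 3),
          round(r["spamhaus_share_of_rejects"], 3))
```

The ratio can exceed 1, as on Nov 25, because Delivered counts one log line per message per server and is not tied to that day's Received count.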

Notes/Methodology:

  1. Mail Operations logged

    This is simply the number of lines in sendmail's combined log.

    Each message is handled twice, before and after processing. A simple case for successful delivery shows:

  2. Received

    Number of log lines showing from= (the envelope sender address).

  3. Delivered

    Number of lines with status Sent.

    One line is logged per message, per server. There may be multiple recipients per server. In the present case, a message normally goes to a single server.
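
    The first three columns are plain line counts, so they can be sketched in a few lines of Python. The sample lines below are invented for illustration and only loosely follow sendmail's log layout. Matching on the substrings ` from=` and `stat=Sent` is an assumption about how the counts were taken, and `count_columns` is a hypothetical helper.

```python
# Invented sample lines in a sendmail-like layout (not real log data).
SAMPLE_LOG = """\
Oct 21 00:00:01 filter1 sendmail[1001]: h9L001: from=<a@example.org>, size=1204, nrcpts=1, relay=mx.example.org [192.0.2.10]
Oct 21 00:00:02 filter1 sendmail[1002]: h9L001: to=<b@example.edu>, mailer=esmtp, relay=inner.example.edu. [192.0.2.20], stat=Sent (ok)
Oct 21 00:00:03 filter1 sendmail[1003]: h9L002: from=<c@example.net>, size=880, nrcpts=1, relay=mx.example.net [192.0.2.11]
"""


def count_columns(log_text):
    """Count logged, received, and delivered lines (hypothetical helper)."""
    lines = log_text.splitlines()
    return {
        # Column 1: every line in the combined log.
        "logged": len(lines),
        # Column 2: lines carrying an envelope sender (assumed marker " from=").
        "received": sum(1 for ln in lines if " from=" in ln),
        # Column 3: lines with status Sent (assumed marker "stat=Sent").
        "delivered": sum(1 for ln in lines if "stat=Sent" in ln),
    }


counts = count_columns(SAMPLE_LOG)
print(counts)
```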

  4. Spamhaus rejected

    Number of lines with a ruleset rejection indicating the Spamhaus list. The rejection text naming the list goes into the bounce info for the sender/postmaster.

  5. Total rejected

    All lines reflecting rejection by sendmail's rulesets.

    These are largely third-party relay attempts, but the rejects may also include non-existent sending domains, etc.

    The standard rulesets and reasons for rejection are:

    check_mail
        Checks the envelope sender address. Rejects when the sending domain does not resolve or the address has no domain (bare username).
    check_rcpt
        Checks the envelope recipient address. Rejects third-party relay (foreign-to-foreign domain), and temporarily rejects when the sending IP address does not resolve.
    check_relay
        Checks the sending IP address for access permission. Currently we reject Spamhaus-listed addresses plus a few local blocks.
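
    The split between Spamhaus rejections and total rejections can be sketched as a tally per ruleset. This is a minimal Python example under stated assumptions: rejection lines carry `ruleset=check_...` and `reject=` fields, and a Spamhaus rejection mentions "spamhaus" somewhere in its text. The sample lines and their rejection wording are invented, and `tally_rejects` is a hypothetical helper.

```python
import re
from collections import Counter

# Invented sample lines; the rejection wording is illustrative only.
SAMPLE_LOG = """\
Oct 23 00:01:00 filter1 sendmail[3001]: h9N001: ruleset=check_rcpt, arg1=<x@example.com>, relay=host.example.net [192.0.2.30], reject=550 5.7.1 <x@example.com>... Relaying denied
Oct 23 00:01:05 filter1 sendmail[3002]: h9N002: ruleset=check_relay, arg1=host2.example.net, arg2=192.0.2.31, relay=host2.example.net [192.0.2.31], reject=550 5.7.1 Mail from 192.0.2.31 refused, see spamhaus list
Oct 23 00:01:09 filter1 sendmail[3003]: h9N003: ruleset=check_mail, arg1=<bare>, relay=host3.example.net [192.0.2.32], reject=553 5.5.4 <bare>... Domain name required
Oct 23 00:01:12 filter1 sendmail[3004]: h9N004: from=<a@example.org>, size=900, nrcpts=1, relay=mx.example.org [192.0.2.10]
"""

# Assumed shape of a rejection line: a check_* ruleset name followed by reject=.
RULESET_RE = re.compile(r"ruleset=(check_\w+),.*\breject=")


def tally_rejects(log_text):
    """Return (rejections per ruleset, Spamhaus rejections) for a log text."""
    by_ruleset = Counter()
    spamhaus = 0
    for line in log_text.splitlines():
        match = RULESET_RE.search(line)
        if match is None:
            continue  # not a ruleset rejection
        by_ruleset[match.group(1)] += 1
        if "spamhaus" in line.lower():
            spamhaus += 1
    return by_ruleset, spamhaus


by_ruleset, spamhaus = tally_rejects(SAMPLE_LOG)
print(dict(by_ruleset), "total:", sum(by_ruleset.values()), "spamhaus:", spamhaus)
```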

  6. Delivery deferred

    Number of lines with status Deferred.

    Recorded any time a receiving server cannot be reached. This may be repeated for a given message. Most of these result from attempts to bounce undeliverable spam. Temporary delivery failures also fall here.
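
    Because a deferral can be logged repeatedly for one message, the line count overstates the number of messages affected. The Python sketch below reports both figures. It assumes each delivery line carries a queue ID between the `sendmail[pid]:` tag and `to=`, and that deferrals contain `stat=Deferred`. The sample lines are invented, and `deferred_stats` is a hypothetical helper.

```python
import re

# Invented sample lines: message hA8010 is deferred twice, hA8011 once.
SAMPLE_LOG = """\
Nov 08 01:00:00 filter1 sendmail[2001]: hA8010: to=<postmaster@example.com>, mailer=esmtp, relay=mail.example.com., stat=Deferred: Connection timed out with mail.example.com.
Nov 08 02:00:00 filter1 sendmail[2050]: hA8010: to=<postmaster@example.com>, mailer=esmtp, relay=mail.example.com., stat=Deferred: Connection timed out with mail.example.com.
Nov 08 02:05:00 filter1 sendmail[2051]: hA8011: to=<user@example.net>, mailer=esmtp, relay=mx.example.net., stat=Deferred: Connection refused by mx.example.net.
Nov 08 02:06:00 filter1 sendmail[2052]: hA8012: to=<staff@example.edu>, mailer=esmtp, relay=inner.example.edu., stat=Sent (ok)
"""

# Assumed position of the queue ID on a delivery line.
QUEUE_ID_RE = re.compile(r"sendmail\[\d+\]: (\w+): to=")


def deferred_stats(log_text):
    """Return (deferred log lines, distinct deferred queue IDs)."""
    deferred_lines = 0
    queue_ids = set()
    for line in log_text.splitlines():
        if "stat=Deferred" not in line:
            continue
        deferred_lines += 1
        match = QUEUE_ID_RE.search(line)
        if match:
            queue_ids.add(match.group(1))
    return deferred_lines, len(queue_ids)


deferred_lines, deferred_messages = deferred_stats(SAMPLE_LOG)
print(deferred_lines, deferred_messages)
```

    The table's "Delivery deferred" column corresponds to the first figure, the raw line count.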

    An expected side effect of the system is that the postmaster mailbox of the filter systems will collect a large amount of bounced mail. Some of this might have been rejected out of hand by the internal server (i.e., the University would not have to deal with it). Some might have bounced for some reason and ended up in the local postmaster's mailbox.

    We hope the effect will include a net reduction in useless bounces for the postmasters of internal systems, and in traffic for those systems. Anecdotal evidence would be appreciated.