HCS Mail Statistics (Fall 2010)

One of the Harvard Computer Society's most prominent public services is mail. HCS hosts 5,000 mailing lists and 1,700 student-group and user email/web accounts. We thought the community would find it interesting to see some of our mail statistics for Fall 2010. We started archiving log data in late September 2010 and have continued through the present, with an interruption in mid-November. This yielded 25 GB of raw data.

Recipient Domains

Recipients of HCS mail included addresses from:

  • 250 Harvard sub-domains (e.g., college.harvard.edu, hms.harvard.edu, cs.harvard.edu),
  • 400 .edu domains (clarification: we counted *.harvard.edu as one domain, *.mit.edu as one domain, etc.),
  • and 5,000 domains total.
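The .edu counting rule can be sketched in a few lines of Python. This is only an illustration, not necessarily how the numbers above were produced: the function names `collapse_edu_domain` and `count_domains` are hypothetical, and the rule assumed here is simply to keep the last two labels of any .edu hostname.

```python
from collections import Counter


def collapse_edu_domain(address: str) -> str:
    """Return the recipient domain, folding *.school.edu into school.edu."""
    domain = address.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    labels = domain.split(".")
    # Assumption: any .edu hostname is identified by its last two labels.
    if labels[-1] == "edu" and len(labels) > 2:
        return ".".join(labels[-2:])
    return domain


def count_domains(addresses):
    """Count deliveries per (collapsed) recipient domain."""
    return Counter(collapse_edu_domain(a) for a in addresses)


if __name__ == "__main__":
    # Made-up sample addresses, for illustration only.
    sample = ["a@college.harvard.edu", "b@cs.harvard.edu", "c@mit.edu", "d@gmail.com"]
    for domain, n in count_domains(sample).most_common():
        print(domain, n)
```

Non-.edu domains are left untouched in this sketch, so mail.example.com and example.com would still count as two domains.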

Let's take a look at the top 20 domains to which HCS delivered mail:


Considering our user base, this seems fairly reasonable: mainly Harvard email addresses, several personal webmail providers (Gmail, Yahoo, etc.), and a few nearby schools (MIT, etc.).

Daily Traffic

Here's a look at our average daily traffic, over the course of a week (stacked graph):

Mail traffic drops slightly toward the end of the week (perhaps as people go out instead of sending email). Traffic once again rises on Monday, as another workweek begins.
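For readers curious how a per-weekday average like this might be computed, here is a minimal sketch. It assumes the log has already been parsed into one `datetime` per processed message; that input format and the name `average_by_weekday` are assumptions for illustration, not a description of our actual pipeline.

```python
from collections import Counter, defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_by_weekday(timestamps):
    """Average messages per day for each weekday that appears in the data."""
    # First count messages per calendar date...
    per_date = Counter(ts.date() for ts in timestamps)
    # ...then group those daily totals by weekday name.
    by_weekday = defaultdict(list)
    for day, count in per_date.items():
        by_weekday[WEEKDAYS[day.weekday()]].append(count)
    # Dates with no log entries never appear, so a gap in the logs
    # is skipped rather than counted as a zero-traffic day.
    return {name: sum(c) / len(c) for name, c in by_weekday.items()}
```

Skipping empty dates is a deliberate choice in this sketch: an interruption in logging would otherwise drag the averages down.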

Hourly Traffic

Let's take a look at our hourly averages, over the course of a day (stacked graph):

Corresponding to sleep schedules (at least in college), there is a lull in mail traffic from 3 am to 8 am, after which traffic picks back up and peaks around mid-afternoon, when presumably everyone is awake.
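An hourly profile can be sketched the same way. The example below again assumes one `datetime` per processed message (a hypothetical input format) and divides each hour's total by the number of distinct days observed; `average_by_hour` and `busiest_and_quietest` are illustrative names.

```python
from collections import Counter
from datetime import datetime


def average_by_hour(timestamps):
    """Return a 24-element list: average messages per day in each hour."""
    timestamps = list(timestamps)
    days = {ts.date() for ts in timestamps}
    if not days:
        return [0.0] * 24
    per_hour = Counter(ts.hour for ts in timestamps)
    return [per_hour[h] / len(days) for h in range(24)]


def busiest_and_quietest(averages):
    """Return (busiest_hour, quietest_hour) as integers from 0 to 23."""
    hours = range(len(averages))
    busiest = max(hours, key=lambda h: averages[h])
    quietest = min(hours, key=lambda h: averages[h])
    return busiest, quietest
```

When several hours tie, `max` and `min` return the earliest one, so the "quietest hour" of a sparse sample is not very meaningful.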

House List Activity

HCS hosts open lists for 11 of the 12 undergraduate houses (all except Leverett). Here's a normalized ranking of their overall activity:

3.4 pfoho-open
3.3 mather-open
2.8 adams-schmooze
2.7 kirkland-list
2.3 currierwire
2.3 eliot-list
2.2 cabot-open
1.6 quincy-open
1.4 moose-droppings
1.4 lowell-open
1.0 throptalk

Each value is a list's traffic divided by that of the least active list (throptalk), so pfoho-open generated 3.4 times as much traffic as throptalk, for example.
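The normalization is just a division by the smallest count. A minimal sketch follows; the list names and message counts are hypothetical, chosen only to illustrate the arithmetic, and are not our actual totals.

```python
def normalize_to_least_active(counts):
    """Scale each list's message count so the least active list is 1.0."""
    baseline = min(counts.values())
    if baseline <= 0:
        raise ValueError("every list needs a positive message count")
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(count / baseline, 1)) for name, count in ranked]


# Hypothetical counts, for illustration only.
example = {"list-a": 6800, "list-b": 3200, "list-c": 2000}
for name, ratio in normalize_to_least_active(example):
    print(f"{ratio:.1f} {name}")
```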

However, some houses are larger than others, so we would expect larger houses to generate more traffic. Re-normalizing by the number of students in each house (the best figures we could find were admittedly old, from Spring 2009), we obtained a per-student ranking of list traffic:

3.5 pfoho-open
3.4 mather-open
3.0 kirkland-list
2.8 adams-schmooze
2.6 cabot-open
2.5 currierwire
2.2 eliot-list
1.5 moose-droppings
1.4 lowell-open
1.4 quincy-open
1.0 throptalk

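This means that pfoho-open generated 3.5 times as much traffic per resident as throptalk, for example. The results do not change dramatically: the top two lists and the bottom list keep their places, while kirkland-list moves ahead of adams-schmooze, cabot-open moves ahead of currierwire and eliot-list, and quincy-open slips behind moose-droppings.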
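The per-student version divides each list's traffic by its house's headcount before rescaling. The sketch below uses hypothetical counts and house sizes (not the Fall 2010 traffic or the Spring 2009 headcounts) to show how two lists can swap places once size is taken into account.

```python
def per_student_ranking(counts, house_sizes):
    """Rank lists by messages per resident, scaled so the lowest is 1.0."""
    rates = {name: counts[name] / house_sizes[name] for name in counts}
    baseline = min(rates.values())
    if baseline <= 0:
        raise ValueError("every list needs a positive message count")
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(rate / baseline, 1)) for name, rate in ranked]


# Hypothetical inputs, for illustration only.
counts = {"list-a": 6000, "list-b": 5000, "list-c": 2000}
sizes = {"list-a": 500, "list-b": 400, "list-c": 400}
for name, ratio in per_student_ranking(counts, sizes):
    print(f"{ratio:.1f} {name}")
```

Here list-a has more raw traffic than list-b, but list-b comes out ahead per resident because its house is smaller.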

Other List Stats

Our 5,000 mailing lists currently serve 120,000 unique email addresses.


Some of these numbers may seem rather large. For example, we appear to have processed an average of 1 million messages on Mondays. This does not mean that we sent email to 1 million unique people on Mondays. Here are some things to keep in mind when considering the numbers we mention:

  1. The phrase "processing messages" encompasses sending, receiving, deferring, bouncing, and rejecting messages.
  2. Many messages we process might be spam.
  3. People often have several of their email addresses subscribed to lists.
  4. Many misspelled or defunct email addresses are subscribed to our lists, which causes additional bounces and mail traffic.
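To make the gap between "messages processed" and "people reached" concrete, here is a small sketch. It assumes the log has been reduced to (status, recipient) pairs, which is a hypothetical format; the status names simply mirror the categories in point 1, and `summarize` is an illustrative name.

```python
from collections import Counter

# Assumed event categories, taken from the definition of "processing" above.
STATUSES = {"sent", "received", "deferred", "bounced", "rejected"}


def summarize(events):
    """Return (messages processed, unique recipients delivered to, per-status counts)."""
    by_status = Counter()
    delivered_to = set()
    for status, recipient in events:
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        by_status[status] += 1
        if status == "sent":
            # Treat addresses case-insensitively when counting recipients.
            delivered_to.add(recipient.lower())
    return sum(by_status.values()), len(delivered_to), by_status
```

Every event counts toward the processed total, but only successful deliveries add to the set of recipients, so the first number can be far larger than the second. Even the second number overstates the number of people, since one person may be subscribed under several addresses.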

Parting thoughts

Some of these numbers are truly remarkable. It would be interesting to see what sort of mail traffic FAS IT, or providers like Gmail, encounter on a day-to-day basis. As always, the Harvard Computer Society continues to strive to provide the highest quality service to the Harvard community. If you are interested in joining HCS or learning more about what we do, feel free to email info@hcs, or drop in on one of our weekly meetings!

Here's to a downtime-free 2011 with even more mail traffic!