A document changes.
An Email gets sent.
What could be easier?
Well, off the top of my head: open-heart surgery, underwater basket-weaving, and giraffe tooth-brushing, to name a few things. SharePoint alerts are a powerful communications tool that gives users a convenient way to consume information from around the site.
That is, if you don't mess with them.
But of course, we are going to mess with them. We want to control the time, frequency, and appearance of the Emails. The default out-of-the-box settings work, but not quite well enough. All that my client wanted to do was simplify (not go nuts, but simplify) the standard digest (daily) Email template, and have the notifications shoot out at 9:00 AM. I estimated about a day to do this.
All we had to do was comment out some CAML from the alerttemplates.xml file (located in 12\TEMPLATE\XML) and run an STSADM command, right? Remove the links and columns that they didn't want in the template, and set the time when alerts go out. RIGHT?
To make a story that is longer than it is interesting shorter and more interesting, we went from OOTB alerts to a custom template to a custom handler using IAlertNotifyHandler to alerts not working for a month to users getting Email bombs of duplicate alerts. We went through three different solutions, two different Microsoft support representatives, and one scary spelunk into the SharePoint database. However, we came out safe and sound on the other side, with a cathartic clarity around how SharePoint daily alerts actually work. It wasn't a perfect outcome; there was quite a bit of drama, but a happy client is a happy client.
What I'm not going to do is take you through all the steps required to either customize an alert template, or wire up a code handler to build the Email from scratch (which is what I opted for, since alerttemplates.xml is 10,211 lines long and besides I really like writing code). Someone already did a good enough job of that here.
Instead, I want to provide a high-level description of how the SharePoint 2007 alerting infrastructure is put together. As I dug through the dumpsters of the Internet looking for any clues to this end, I heard a lot of rumors and false positives and frightening tales. I also came across a few golden nuggets, either from dependable sources, my own research, or the words from Microsoft's mouth itself. So what I'm going to do is simply list out the pieces that work together to make alerts work, having already scrubbed out all the wrong turns I made while traversing down this path.
So here are the truths about daily alerts, in Q & A form...
Which timer job executes daily alerts?
The "Immediate Alerts" job.
Wait, isn't that the job that runs the immediate alerts?
Yes.
So the one job handles all alert notifications?
Yes.
There is no "Daily Alerts" job?
Correct. There isn't one.
Which server physically executes this job?
Officially: All timer jobs run on all front end servers. So any web front end (or indexer, as I've seen) that has the proper assets deployed to it for a particular timer job (DLLs, XML files, etc.) can execute it successfully. Unofficially: not really; even though Central Administration -> Operations -> Timer Job Status reports that the Immediate Alerts job runs on all servers, only one is actually doing the work.
To determine which one, you have to dive into the database. I know. It's ugly; it hurts. Shush. Run the following query against the content database for your web application:
What this does is return the one and only row there will ever be in this table. The two columns are the name of a server and a date. All we really care about is the name of the server: this is the one that's going to do the notification work. If you want to change it to another server, stop the "Windows SharePoint Services Timer" service on this box, and wait for another server to jump in and acquire this lock.
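The query itself isn't reproduced in this copy of the post, so here is only a rough stand-in for its shape, not the original. It uses an in-memory SQLite table in place of a real content database. The table name TimerLock is the one used in the summary further down; the column names, server names, and dates are made-up placeholders, and the hand-over function is just a model of "another server jumps in," not SharePoint's actual locking code.

```python
import sqlite3

# Stand-in for a content database. "TimerLock" is the table name used in this
# post; the column names and sample values are placeholders, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TimerLock (server_name TEXT, locked_at TEXT)")
conn.execute("INSERT INTO TimerLock VALUES ('WFE01', '2009-03-02 09:00:00')")


def lock_holder(connection):
    """Return the server named in the table's single row."""
    rows = connection.execute("SELECT * FROM TimerLock").fetchall()
    assert len(rows) == 1, "expected exactly one lock row"
    server_name, _locked_at = rows[0]
    return server_name


def hand_over(connection, new_server, when):
    """Model another server taking the lock once the holder's timer service stops."""
    connection.execute(
        "UPDATE TimerLock SET server_name = ?, locked_at = ?",
        (new_server, when),
    )


print(lock_holder(conn))
hand_over(conn, "WFE02", "2009-03-02 09:05:00")
print(lock_holder(conn))
```

Against a real farm the only part that carries over is the idea of reading that single row; everything else here is scaffolding to make the sketch runnable.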
When do daily alerts get sent?
OOTB, alerts are enabled, and the Immediate Alerts job runs every five minutes. What these runs do is ask the question: Are there any alerts that want to be sent out, but haven't been? All alerts that answer "Yes!" get sent out.
So how do daily alerts know that they want to be sent?
Every alert has a time. For daily alerts, this time defaults to one day (or, more accurately, 24 hours) from the instant they are created. If you create an alert OOTB, you can select an hour on the UI, and the alert will be sent 24 hours from that time. In either case, this "Alert Time" is stored as a real, absolute time, not a TimeSpan.
So every time the Immediate Alerts job runs, it checks for alerts whose time is less than the current time. For each one, it fires the Email, and increments this time by another 24 hours.
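To make that loop concrete, here is a minimal simulation of it. This is a model of the behavior described above, not SharePoint code: the DailyAlert class, the function name, and the dates are all invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class DailyAlert:
    owner: str
    next_send: datetime  # an absolute time, not a TimeSpan
    sent_at: list = field(default_factory=list)


def run_immediate_alerts_job(alerts, now):
    """One run of the job: send every alert whose time has passed, then push it out 24 hours."""
    sent = []
    for alert in alerts:
        if alert.next_send < now:
            alert.sent_at.append(now)  # stands in for firing the Email
            alert.next_send += timedelta(hours=24)
            sent.append(alert.owner)
    return sent


# Hypothetical alert that wants to go out at 9:00 AM.
alert = DailyAlert(owner="some.user", next_send=datetime(2009, 3, 2, 9, 0))

# The job runs every five minutes for 48 hours.
tick = datetime(2009, 3, 2, 8, 52)
for _ in range(48 * 12):
    run_immediate_alerts_job([alert], tick)
    tick += timedelta(minutes=5)

print(alert.sent_at)    # one send per day, on the first run after 9:00
print(alert.next_send)  # already pointing at the following day
```

Note that the Email goes out on the first job run after the alert's time, so with a five-minute schedule it can land up to five minutes late.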
What about the job-immediate-alerts property?
All this does is regulate how often the Immediate Alerts job runs. This has no effect if alerts are disabled (the "alerts-enabled" property is set to false). Remember, these properties are specific to a web application.
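For reference, these two properties are set per web application with stsadm's setproperty operation. The sketch below only builds the argument lists and runs nothing. The URL is a hypothetical placeholder, and the schedule string is one commonly cited form, so check it against your own farm before relying on it.

```python
def stsadm_setproperty(name, value, url):
    """Build the argument list for 'stsadm -o setproperty' (nothing is executed here)."""
    return ["stsadm", "-o", "setproperty", "-pn", name, "-pv", value, "-url", url]


WEB_APP_URL = "http://intranet.example.com"  # hypothetical web application URL

# Make sure alerts are enabled for the web application.
enable_alerts = stsadm_setproperty("alerts-enabled", "true", WEB_APP_URL)

# Run the Immediate Alerts job every five minutes.
every_five = stsadm_setproperty(
    "job-immediate-alerts", "every 5 minutes between 0 and 59", WEB_APP_URL
)

print(enable_alerts)
print(every_five)
```

On a SharePoint server, lists like these could be handed to subprocess.run with the full path to stsadm.exe; keeping them as lists avoids quoting trouble with the spaces in the schedule value.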
And what about the job-daily-alerts property?
The main purpose of this property is to confuse the hell out of everyone. It has been deprecated since SharePoint 2003, but was never removed from the command line.
It has NO effect.
NOTHING. Don't ever think about it. Ever.
So in summary?
In summary, alerts store the next time they want to be sent. Every execution of the Immediate Alerts timer job on the server that has acquired the "TimerLock" (for that content database) will send any daily alerts whose time is in the past. Finally, it will increment these times by 24 hours.
Now, there's one other phenomenon that I've observed with all this alert stuff: duplicates. And not just duplications, but DUPLICATES - as many as a half dozen extra Emails for each unique notification. The most bizarre case we saw was when a user had an alert against a document library that saw five changes in one day, and he received five Emails: one with all five changes listed, one with only the first four, and so on, down to one Email advertising just one change.
What was causing the duplication? The prime suspect was of course the customization of the template, which, admittedly, had a bug, and was broken for some time: there were no alerts for weeks. But eventually I fixed it, and proved it by writing to the ULS logs each time the handler was invoked. And indeed, it was called exactly once for each Email the aforementioned user received. So why was one alert sending out so many notifications?
The most bizarre aspect of this problem was how it was solved, because it solved itself. It was as though the SharePoint alerting infrastructure had a cold: it got worse and worse, barfed, and then started to get better and better. Seriously. My user reported that he got fewer and fewer duplicates each day, until it completely worked itself out and only sent out one notification per alert; each Email containing a list of changes that that document library had experienced over those last 24 hours.
Of course this is unsettling; problems that solve themselves usually have the propensity to start themselves back up. I had a hunch, and I don't blog about hunches. However, I will blog about hunches that Microsoft Premier Support endorses!
First, a bit of background: there is another database table that people seem to get their hands dirty with when it comes to fixing alerts: EventCache. From what I can tell, this is the backlog of all changes made to all document libraries. We can then think of alerts as sort of queries into this table. Once an alert gobbles up a change, that change is marked as processed and won't be alerted against again.
Therefore, a daily alert notification is actually the set of all unique changes that an alert (which is welded to a document library) "finds." What happens if those changes in EventCache are never processed? They are dangling there, like icicles, waiting to melt back down to the ground. If alerts are broken, these changes seem to just stack up.
Eventually, when alerts are fixed, that stack will topple. It will be like unclogging a drain. However, that's not what happened. The Email duplications diminished slowly, as if each alert could only pull changes from one day at a time. In other words, the alert, when activated by the Immediate Alerts job, doesn't seem to blindly ask EventCache for "all unprocessed changes for this document library" (as I would assume) but rather for "some 24 hour block of changes." Maybe it's that timeframe that gets out of whack, and changes from two different days are processed in the same batch. Or perhaps the logic is flawed, and times aren't updated properly if the notification system is malfunctioning due to a bug somewhere in the template customization.
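The difference between those two behaviors is easier to see in a toy model. To be clear, this only illustrates the hunch: it is not how EventCache is known to work, it does not reproduce the duplicates, and every name and date in it is made up.

```python
from datetime import datetime, timedelta


def make_backlog(start, days, changes_per_day):
    """Timestamps of unprocessed changes that piled up while alerts were broken."""
    return [
        start + timedelta(days=d, hours=h)
        for d in range(days)
        for h in range(changes_per_day)
    ]


def drain_all(backlog):
    """The expected behavior: one notification carrying every unprocessed change."""
    return [list(backlog)] if backlog else []


def drain_by_day(backlog):
    """The hunch: each run only picks up the oldest 24-hour block of changes."""
    remaining = sorted(backlog)
    batches = []
    while remaining:
        window_end = remaining[0] + timedelta(hours=24)
        batches.append([t for t in remaining if t < window_end])
        remaining = [t for t in remaining if t >= window_end]
    return batches


# Three days of backlog, two changes per day.
backlog = make_backlog(datetime(2009, 3, 1), days=3, changes_per_day=2)

print(len(drain_all(backlog)))                 # a single notification with everything
print([len(b) for b in drain_by_day(backlog)]) # one smaller notification per 24-hour block
```

Under the second model a long outage takes many runs to clear, which at least matches the slow recovery, even if it says nothing about why the same changes showed up more than once.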
It still makes more sense to me that, once repaired, all notifications will be shot out in the next batch. Furthermore, I think it's counterintuitive that "old" changes should be notified at all. If a user hooks up an alert, they want new information - new content. The apropos availability of old news isn't new news.
Each day as the alerts were "worsening" and then "healing" themselves, my client's user forwarded me all the Emails he got. So did another user, who only had one alert, and only got dups sometimes. The point is that there were no discernible patterns. I pored over the frequencies among alerts, number of Emails, and number of changes in each Email, but came up with nothing. I spent hours digging through verbose ULS logs, only to further validate that my custom handler was called exactly once per whatever it was that SharePoint was detecting as a notification.
When my Microsoft rep backed up the above hunch, I was encouraged, but not satisfied. Unfortunately, this remains a mystery to this day. Microsoft closed the case without any more discussion. My client closed the bug without any more discussion. It's like everyone but me is happy that this one little bug - in a change that was only supposed to take a day to implement - has magically disappeared.
Well it's time for me to join their ranks, as there is no more time, money, calories, or sanity to invest into the duplicate alert issue. Usually my posts have happy endings, but alerts are emo like that. If anyone out there has any ideas, please don't hold back. Hopefully I was able to at least help a little bit in terms of cracking open the black box of enigmatic SharePoint 2007 daily alerts.