In this post I will document some of the observations and conclusions I have come to in the process of hammering on the event handler functionality of SSIS, trying to see how it ticks, making sure I understand I can use it properly in my system, which is a multi-package system with some specific logging, exception handling, and monitoring requirements.
Background
I want to make use of event handlers to accomodate the operational logging needs of the overall system, which will encompass multiple SSIS packages and external "job control" and monitoring applications. (By "operational logging" I mean a level of domain-specific logging that is apart from the purely mechanical logging that is built into SSIS). (For the record, I'm decoupling my operational logging from the SSIS logging, but I will be able to link them after the fact because all operational log entries will store the ExecutionInstanceGUID, which I can use to link back to the SSIS logs if need be. I'm also storing the ExecutionInstanceGUID in the exception log, which I'm hoping will afford the system with some pretty sweet monitoring and troubleshooting capabilities.)
Getting back to my operational logging needs: For example, I need to log when a package starts and finishes (again, separately from the built-in SSIS logging), with a different "finished" log entry for success and failure. I was also hoping to use the OnError event handler at the package level to log exceptions, and maybe also to use the OnProgress event for logging incremental progress for long-running packages. One of the reasons I wanted to standardize these things in event handlers is that I could remove these concerns from future package developers in this system by creating a package template that included the proper event handlers.
My Issues
With all due respect to the SSIS development team (which clearly has done a pretty awesome job overall with SSIS), here are my issues with SSIS event handlers:
a) OnError cannot stop the flow of execution. It's nice that I can use this to log errors (though it would result in a lot of duplicate logging--see C below), but as a centralized error handler in the sense of structured error handling, it's not useful. I realize that this was probably the intent of the designers, but to me an effective exception handler takes control of the flow of execution, somewhat like a 500;100 custom error page in classic ASP or a Catch block in C# or Java. The MaximumErrorCount property provides some relief in this area.
b) Cannot debug script tasks in event handlers. This appears to be a bug. I can't get the debugger to stop on a line of code in a script task in an event handler. Strangely, if I set a break point in a script task in an event handler, a VBA code window will pop up for the first script task in the package (even though the breakpoint is not set in that task), but there will never be a break in the task where I set my breakpoint.
c) Events are not discriminating enough--they fire too often. For example, if I create an OnPostExecute event handler at the package level on a package with one sequence container and three chained script tasks, the *package-level* OnPostExecute event handler will fire *five times*! That means that if I want to use the package-level OnPostExecute event handler to, for example, log when a package is finished, I have to put an explicit filter in the code of a script task to test the SourceName or SourceID system variables to determine where the event is firing from. And what if I don't want to use a script task? I'd have to put a script task in front of the task I really wanted to use, with a precedence constraint, just to ensure that the real task I want does not get called when it doesn't need to. I don't like it, but it looks like this is what we're stuck with. (This post also describes further the behavior I've observed with OnError firing extra times [1].)
I have found one technique to be useful to help ignore the extra firings of an event handler (this code assumes you have put the variables SourceName and PackageName in the ReadOnlyVariables field on the script dialog):
Public Sub Main()
'Assume success
Dts.TaskResult = Dts.Results.Success
If Dts.Variables("SourceName").Value.ToString() = _
Dts.Variables("PackageName").Value.ToString() Then
System.Windows.Forms.MessageBox.Show("Package OnPostExecute")
End If
End Sub
This code is from a script task in the pacakge-level OnPostExecute event handler. I'm planning to use as a simplifying assumption the principle that, in general, I'm not going to use event handlers below the package level, so this branching logic will probably be standard in all of my package-level handlers. Hopefully there is not a performance hit from dozens or hundreds of useless OnPostExecute events firing and being ignored in this way.
You could compare SourceID with PackageID instead of the names, but this statement from Jamie Thomson's blog gave me pause: "there is a problem in SSIS which results in all packages that are created from the same template having the same package ID." I suppose that could be moot in this context since there is only one PackageID within a given package, but it makes me less enamored of depending on this value for anything.
d) DisableEventHandlers does not stop parent-level events from firing. I have yet to get this property have any effect on whether or not a "parent" event fires for container or task. For example, in the aforementioned example of the package with a sequence container and three chained script task, if I set DisableEventHandlers to True on the container, that does not stop the package-level OnPostExecute event firing for that container. It will stop an event handler for *that* container from firing, but not the package-level handler (which I don't think should be firing in the first place). Even if DisableEventHandlers did work as I think it should (that is, effecting the parent events also), I don't want the implementation and maintenance burden of ensuring that I (and everyone who comes after me) remember to change the DisableEventHandlers property to True for every task and container.
Why I'll Use Event Handlers Anyway
This is disappointing, but I'll likely end up using these handlers anyway because the alternatives are not attractive (some solutions would depend on remembering to set a certain property to a certain value on every single task and container, which I'd like to avoid).
In the case of error handling, if everything was a script task in which I could standardize a Catch block for error logging, I might be able to get away from using OnError, but I need to catch and log errors from other kinds of tasks also, so it looks like OnError is what I'm going to have to use.
I am planning to use the package-level OnPostExecute to detect and log package failure. I may use a package-level OnTaskFailed event handler (taking advantage of the behavior described in C above) to set a flag in a package variable so that when the package-level OnPostExecute finally fires it can check this flag to determine whether to log "success" or "failure." I don't like it, but it seems better than trying to remember the set the FailPackageOnFailure property to True for every single task and container in the package.
(Returning to background, when I'm thinking of success and failure in relation to operational logging, this is a logical, domain-specific concept of success/failure, not the SSIS concept of success/failure, which is purely mechanical. SSIS may see that a package ran successfully, but in the context of the system requirements it may have actually failed. See also this post [2].)
I will also need to ensure that all downstream tasks and containers operate on a Success constraint so that when a task fails the rest of the package won't execute. If I knew that the firing of OnError on task failure [3] was documented, standard, and stable behavior, then I might be inclined to use it to my advantage, but I'm reluctant to do that at this point.
Comments welcome. I'm happy to be proven wrong on anything I've written here.
Dan