Thing Doers and Other Class Types
There's a whole world out there of patterns and anti-patterns and design methodologies and frameworks, and all of these resources are useful and important to the sort of developer who's willing to do the research and study and be focused on being all that he can be, as it were.
But I've found that most developers are not of this sort. Even I (as excellent as I am!) have a lot of trouble getting through a book of software patterns. Some of those patterns can be very hard to understand, let alone apply to a real world situation. What is needed, I think, are a few simple guidelines that can help young, inexperienced developers, or simply developers with lives, to write better code. If such a resource exists I do not know of it. While McConnell's Code Complete has a lot of great stuff, most coders are not going to slog through all 1,000 pages. Should they have to?
Dan long ago promised us a sort of Strunk & White for coders. Where is that precious book? I think we need it. [Ed note: I'll get it written someday!]
The Five Types of OO Classes
As sort of a primer on Object Oriented programming, I'd like to offer a few insights and opinions of my own.
After six years of experience in programming in an Object Oriented language, I've identified five different types of classes that occur in OO programming. Knowing and understanding the distinction and knowing how to build each of these types of classes is, I think, essential to writing decent code that is easy to read, understand, debug, and maintain. My five types of classes are:
1) Property Bags—the classic concept of an object. Basically a class that represents a real-world or virtual thing of some sort and in practice is basically a handy repository for a number of properties about that thing.
2) Thing Doers—An object that has a task to perform. If well designed, this task will be clear and concise.
3) Utility Method repositories—Such a class is often called Util or SomethingUtil. This is a repository of a number of common, static utility methods.
4) Plumbing Classes—Classes that are required not for the business rules, but for the implementation. Here I mean such things as servlets, EJBs, and so on. (Sorry to be Java-centric. I'm certain there are equivalents in other languages.)
5) UI Components—In fact, these could be considered a special case of Plumbing Classes, but they are so important and specialized that they deserve their own category.
I have created this list not as some sort of moot exercise in taxonomy but to make a larger point: each type has its use, and misusing one or more of these types leads to sloppy spaghetti code.
Most developers use type 1 and types 3 through 5 with complete ease. Few developers ever write a class of type 2 and this is a darn shame because the Thing Doer is the most elegant of all of OO creations. Creating Thing Doers is an art I wish to share.
My Beloved Thing Doer
Far be it from me to be prescriptive (hey, it's my blog, though), but business logic should only and always be placed in Thing Doer classes. Oh, what a wonderful object-oriented world you build when you populate your code with Thing Doers. Let me see if I can explain.
Far too common is the placement of code into plumbing classes. A famous anti-pattern in Java is the monolithic servlet. This could also be the monolithic EJB. I'm sure .NET has equivalents to both of these constructs.
A naïve developer is given an assignment to create a bit of functionality. This developer knows she needs to implement the functionality in a J2EE server, so this means creating an EJB. So our young developer gets to work on her EJB and pumps it full of functionality, i.e., business logic. What's wrong with this approach?
Oh, so much I don't even know where to begin. But let me throw in here that I have done my share of such coding.
First of all, when you develop like this, you are tied to your implementation. If someone wants to migrate to a different system (Spring, perhaps) you have a whole lot of migration to do. Yes, you'll have to do some migration anyway, but you can save yourself a lot of potential trouble.
Even more important, though, is that code like this completely avoids the strengths that OO can offer you. The strength that a Thing Doer object leverages is the module-level variable (in Java terms, an instance field). If you've got a complex process with a number of variables that need to be shared, there is no need to pass every single one of them through to every method. You can make them module-level and each method can act on them.
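To make the idea concrete, here is a minimal sketch of a Thing Doer whose steps share instance fields instead of passing values from method to method. The class name OrderPricer, the discount rule, and the tax rate are all invented for illustration; none of them come from a real application.

```java
import java.util.List;

// Hypothetical Thing Doer: create one instance per pricing run and call price() once.
final class OrderPricer {
    private final List<Long> lineItemCents; // input handed in at construction

    // Working state shared by the private steps (the "module-level" variables).
    private long subtotalCents;
    private long discountCents;
    private long taxCents;

    OrderPricer(List<Long> lineItemCents) {
        this.lineItemCents = lineItemCents;
    }

    long price() {
        sumLineItems();
        applyVolumeDiscount();
        addSalesTax();
        return subtotalCents - discountCents + taxCents;
    }

    private void sumLineItems() {
        for (long cents : lineItemCents) {
            subtotalCents += cents;
        }
    }

    // Assumed rule: 10% off once the subtotal reaches $100.
    private void applyVolumeDiscount() {
        if (subtotalCents >= 10_000) {
            discountCents = subtotalCents / 10;
        }
    }

    // Assumed rate: 5% tax on the discounted amount.
    private void addSalesTax() {
        taxCents = (subtotalCents - discountCents) * 5 / 100;
    }
}
```

None of the private steps takes a parameter or returns a value; they all read and write the same three fields. That is only safe because each caller makes its own OrderPricer and throws it away afterwards.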
No such approach is possible from inside an EJB. Since an EJB is essentially a plumbing object, the developer has no control of its creation. It may be (almost certainly will be) shared. You can't use module-level variables in an EJB or a servlet. So very many developers do not get this, but it is a great big no-no. Other threads will come along and change your values. This is a very dangerous practice.
To avoid this approach, many developers simply pass every single value needed from one method to another in the same class. Developers who do this fail to realize that a method call that does not access shared data in a module-level variable is essentially a static method call. Why not just put everything into a big Utility Method Repository and be done with it?
Instead of putting your business logic right there in the plumbing class, your plumbing class should instead do as little as possible. Mostly what it should do is instantiate and execute a Thing Doer class.
Let's say your EJB needs to create an invoice. Then your best approach is to start thinking about the design of a Thing Doer class called InvoiceCreator. InvoiceCreator might need to find an Order to go with the invoice. In that case, you might think about creating another class called OrderFinder.
What is left in your plumbing class is only what's necessary for the dispatching of your business logic to take place. In most cases this will be almost nothing, though it may be necessary to put validation or authentication code in your plumbing class.
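Here is a rough sketch of how that division of labor might look. InvoiceEndpoint is only a stand-in for a real servlet or EJB (so that the example needs no container), and the Order and Invoice fields, the in-memory map used as a data store, and the validation rule are all assumptions made for the example.

```java
import java.util.Map;

// Property Bags: plain data, no business logic.
final class Order {
    final int id;
    final long totalCents;

    Order(int id, long totalCents) {
        this.id = id;
        this.totalCents = totalCents;
    }
}

final class Invoice {
    final int orderId;
    final long amountCents;

    Invoice(int orderId, long amountCents) {
        this.orderId = orderId;
        this.amountCents = amountCents;
    }
}

// Thing Doer: looks up the order an invoice belongs to.
final class OrderFinder {
    private final Map<Integer, Order> orders; // stand-in for a real data store

    OrderFinder(Map<Integer, Order> orders) {
        this.orders = orders;
    }

    Order find(int orderId) {
        Order order = orders.get(orderId);
        if (order == null) {
            throw new IllegalArgumentException("No order with id " + orderId);
        }
        return order;
    }
}

// Thing Doer: builds an invoice for one order.
final class InvoiceCreator {
    private final OrderFinder finder;
    private final int orderId;

    InvoiceCreator(OrderFinder finder, int orderId) {
        this.finder = finder;
        this.orderId = orderId;
    }

    Invoice create() {
        Order order = finder.find(orderId);
        return new Invoice(order.id, order.totalCents);
    }
}

// Stand-in for the plumbing class. It holds no per-request state of its own.
final class InvoiceEndpoint {
    private final Map<Integer, Order> orderStore;

    InvoiceEndpoint(Map<Integer, Order> orderStore) {
        this.orderStore = orderStore;
    }

    Invoice handleCreateInvoice(int orderId) {
        // Validation may stay in the plumbing; everything else is delegated.
        if (orderId <= 0) {
            throw new IllegalArgumentException("orderId must be positive");
        }
        return new InvoiceCreator(new OrderFinder(orderStore), orderId).create();
    }
}
```

Because the endpoint builds a fresh InvoiceCreator for every call, the Thing Doers can be exercised from a plain unit test with nothing but a HashMap of orders.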
I believe very strongly that all OO developers should learn to think in terms of creating Thing Doers. Such classes are easy to debug (no plumbing code, so no containers required), easy to retrofit to different implementations, and easy to stitch together, re-use, write, and maintain. I think such classes are the essence of OO coding.
On the Misuse of Property Bags
Another common approach (and I believe I will get some defenders of this here) is to put a lot of business code into a Property Bag class. To me this is a crude approach, but I'm prepared to admit that this is a personal preference. I find, though, that it can be hard to maintain an application with a lot of classes that look like property bags but which are laden with business logic. Should your Invoice class contain a createInvoice() method? Should your Order class contain a findOrder() method? I've found that once you start down that road, your property bag classes become huge and complex and difficult to maintain. Such an approach certainly violates the concept of cohesion.
There is much to be said for the elegance and simplicity of a property bag class that is nothing more than that. And the naming of a Thing Doer can make its function clear. What clue will you have that the method for creating an invoice is in the Invoice class?
I believe that these are some simple and easy-to-apply concepts that can be a big help to OO developers. Arriving at these realizations was certainly a big help to me.
re: Thing Doers and Other Class Types
Sorry to (strongly) disagree with you, but any developer worth his salt will read, study, and re-read Code Complete. It is indispensable in becoming a great programmer.
Any developer who hasn't read it or refuses to read it because of its length (or any other reason), just isn't worth hiring. Remember, 80% of the work gets done by 20% of the developers. It is better to have fewer, excellent developers (the kind who will read Code Complete) than it is to have more, poorer developers. The poor developers just slow down everybody else.
And implying that only developers without lives read books is, quite frankly, horrible. Developers that have a self-improvement plan, which involves reading magazines, journals and (sometimes rather large) books, show dedication to their craft and self-improvement. They understand that Things Change, and are diligently striving to stay current and get better.
There is no such thing as a perfect programmer.
I think, and I may be wrong, that you're talking about...
...the Command pattern, in terms of classic GOF patterns. Or at least a derivation of it (note the small-d use of that word -- I'll not be attacked for misusing the nomenclature...)
I very much see your point on the subject of encapsulating system "plumbing" classes away from data- or biz-logic-specific classes.
But on that same note, while I certainly see the utility of having property-only classes in certain circumstances, the same idea of encapsulation makes that a little too closed-off for my tastes.
An Invoice, it can be pretty soundly reasoned, had better know how to do Invoice stuff -- including CreateInvoice, SaveInvoice, etc -- including certain types of business logic directly pertinent to Invoices.
The trick is making sure that no other classes know or care how the Invoice class does all that. And also (to get closer to full agreement with you) to make sure that areas of interoperation w/ other classes are encapsulated themselves. Again, here we go with the Command pattern (or any number of other GOF patterns, actually)
I think there are some platform-related perspectives involved in your statements that can't be ignored -- and that's neither good nor bad. In my (admittedly limited) experience with Java, I found that there were ways you'd do things in Java that you might not do elsewhere, and vice versa -- pure OO strategy aside.
And, unfortunately, I also have to echo Aaron's comment. Reading is, as the saying goes, fundamental. I have learned an important lesson after 2 years on my current gig.
Tell Don't Ask
Thanks for the interesting post and for the comments so far. I still intend to comment more fully on this thread, but in the meantime please allow me to add this excellent Pragmatic Programmers article called "Tell, Don't Ask" to the mix.
Dan
The Reading Habits of Programmers
We can "ought" and "should" all we want, but the fact remains that most programmers are not extremely well read in their chosen vocation. I have encountered precious few in my travels who have ever cracked Code Complete or any other such tome, and most of those either recommended it to me or were recommended it by me. We all must work with and often instruct and review the work of such developers.
The need for a concise book on the subject of good (dare we hope for excellence?) coding practices remains. I find that, as useful as Code Complete is, it is somewhat lacking as a primer and resource for developers. This is not to knock this excellent work at all, but to say that it is not what it is not trying to be. The numerous citations of scholarly studies make the case McConnell sets out to make, but tend to fatten the book up and make it seem more an academic treatise than a primer for developers. And its largeness, I think, reflects to some degree at least a marketing strategy that says "Computer books are big!"
What I'd love to see would be a nifty little book of maybe 100 pages that every developer would keep on his/her desk that would contain simple rules for creating elegant code. I believe such a book is possible and maybe, just maybe, I'll have to write the darn thing myself.
I purposefully avoided connecting my discussion above to any specific pattern. Yes, there is some similarity to the command pattern, but what I'm arguing for is a more general usage.
Edward's comments on specificity are right on target. Specificity is all relative, however, and overly specific names can become nigh-on unreadable and there is always a delicate balance.
I figured that my exclamation that business logic code does not belong in "property bag" classes would bring up some debate. To a certain extent, I'll admit this to being a matter of personal style. Well, let's hope that it's actually project-wide style.
But I think there is much to be said for separating the concerns of data (Invoice) from business rules (InvoiceCreator or whatever). Such a separation makes it easy to create additional, more specialized classes as needs change (InvoiceCreatorFromGoofyOrderForm or whatever). It makes a certain semantic sense as well. Should an invoice create itself? Well, in the "real" world, self creation is generally limited to God. Mapping out various Things and Actions can be a useful architectural practice and separation lends itself well to this. It also splits up code into small files, which is always a good idea (as McConnell, his name 1,000 times be praised, would no doubt remind us).
OO or Not?
Rob writes:
But I think there is much to be said for separating the concerns of data (Invoice) from business rules (InvoiceCreator or whatever). ... Should an invoice create itself? Well, in the "real" world, self creation is generally limited to God.
If I understand you correctly, this spins exactly into why I posted the Tell, Don't Ask link in my previous comment: you are arguing for a decidedly non-OO design style, something more like a hybrid procedural-OO style. In pointing this out it is not my intention to scold or argue for some kind of pure OO ideal, but I do think it is interesting to have this discussion, if for no other reason than to help us understand what "object oriented" means today.
There are definitely those who would say that there is a "correct" way of doing OO, and "fully OO" languages like Smalltalk pretty much enforce this practice. But there is a whole generation of people who have probably never worked in a "pure OO" language, but instead have come up developing web-, client-server, and database-oriented applications using "hybrid" languages like Visual Basic, Java, and C#. The design patterns and best practices that have developed in this world are focussed on solving other kinds of problems, like scaling simultaneous web requests, minimizing marshalling across process boundaries, and working within platform constraints.
So I don't have a preconceived notion of "correct" in mind. But I do think that there is value in examining the principles, practice, and results behind hybrid procedural-OO designs. Take this excerpt from the aforementioned "Tell, Don't Ask" article:
Alec Sharp, in the recent book Smalltalk by Example, points up a very valuable lesson in few words:
Procedural code gets information then makes decisions. Object-oriented code tells objects to do things.
--- Alec Sharp
That is, you should endeavor to tell objects what you want them to do; do not ask them questions about their state, make a decision, and then tell them what to do.
The problem is that, as the caller, you should not be making decisions based on the state of the called object that result in you then changing the state of the object. The logic you are implementing is probably the called object's responsibility, not yours. For you to make decisions outside the object violates its encapsulation.
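A small Java sketch may help pin down the distinction the excerpt draws. PayableInvoice and PaymentDesk are hypothetical names, and the rule that an invoice counts as paid once its balance reaches zero is an assumption made only for the example.

```java
// Hypothetical invoice that owns the rule for what "paid" means.
final class PayableInvoice {
    private long balanceCents;
    private boolean paid;

    PayableInvoice(long balanceCents) {
        this.balanceCents = balanceCents;
        this.paid = balanceCents == 0;
    }

    long getBalanceCents() { return balanceCents; }
    boolean isPaid() { return paid; }
    void setBalanceCents(long balanceCents) { this.balanceCents = balanceCents; }
    void setPaid(boolean paid) { this.paid = paid; }

    // "Tell" style: the caller says what happened and the invoice decides the rest.
    void applyPayment(long cents) {
        balanceCents = Math.max(0, balanceCents - cents);
        paid = balanceCents == 0;
    }
}

final class PaymentDesk {
    // "Ask" style: read state, decide outside the object, then write state back.
    static void payByAsking(PayableInvoice invoice, long cents) {
        long remaining = invoice.getBalanceCents() - cents;
        if (remaining <= 0) {
            invoice.setBalanceCents(0);
            invoice.setPaid(true);
        } else {
            invoice.setBalanceCents(remaining);
        }
    }

    // The same operation when the decision lives inside the invoice.
    static void payByTelling(PayableInvoice invoice, long cents) {
        invoice.applyPayment(cents);
    }
}
```

Both methods leave the invoice in the same state. The difference is where the "paid" rule lives: in payByAsking every caller has to repeat it correctly, while in payByTelling it exists in exactly one place.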
This seems like a highly useful distinction to me. Rob, how do you think your class design suggestions fit into this?
Dan
Decisions Outside the Object
Dan's point about few developers working on "pure" OO is a good one. Purity is for the religious, I think. For most purposes, practicality rules, and the Tell, Don't Ask article makes some good points about the limitations and tradeoffs inherent in purity, as such.
The distinction I assume you refer to, Dan, is between telling objects what you want them to do and asking them about their state and making a decision.
The problem with this distinction is that sometimes you are actually in the object that needs to make the decision. I suppose you could say that you would then be making decisions about your own state (you being the object, I guess). But that line of reasoning only goes so far. Suppose what you need to do is send an Invoice to the account reconciliation department if more than five line items cost more than $20?
I guess the part of the above I left out was "and then tell the object what to do."
Should the object, in this case the Invoice, be responsible for its own reconciliation and routing to the accounting department? That would be one heck of a big class file by the time you got done with it.
Unless you think of, say, the InvoiceReconciler, as the object and the various Invoices that the reconciler reads and acts on as the reconciler's data. But I think this is basically nonsense.
If reading the state of an object and deciding what to do is to be considered procedural, well, that's not a major issue for me. But to me the most useful application of OO technology (as opposed to methodology) is to create many focused "Thing Doer" classes and have them act on and pass around dumb little property bag classes. You are never going to be able to cram all of your Invoice functionality into your Invoice class. Or rather, doing so would result in a mess. So I would argue that you might as well not put any of that functionality in there. Treat all the functionality the same way: as business rules that read and act on Invoices.
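As a sketch of that style, the reconciliation rule from the example above (more than five line items costing more than $20) could live in a small Thing Doer that reads a dumb property bag. The names LineItem, InvoiceBag, and ReconciliationCheck are made up for the illustration, and prices are assumed to be stored in cents.

```java
import java.util.List;

// Dumb property bags.
final class LineItem {
    final long priceCents;

    LineItem(long priceCents) {
        this.priceCents = priceCents;
    }
}

final class InvoiceBag {
    final List<LineItem> lineItems;

    InvoiceBag(List<LineItem> lineItems) {
        this.lineItems = lineItems;
    }
}

// Thing Doer holding the one business rule from the example.
final class ReconciliationCheck {
    private static final long PRICE_THRESHOLD_CENTS = 2_000; // $20
    private static final int MAX_EXPENSIVE_ITEMS = 5;

    private final InvoiceBag invoice;

    ReconciliationCheck(InvoiceBag invoice) {
        this.invoice = invoice;
    }

    // True when more than five line items cost more than $20.
    boolean needsReconciliation() {
        int expensive = 0;
        for (LineItem item : invoice.lineItems) {
            if (item.priceCents > PRICE_THRESHOLD_CENTS) {
                expensive++;
            }
        }
        return expensive > MAX_EXPENSIVE_ITEMS;
    }
}
```

When the threshold changes, or a second routing rule shows up, the change lands in this class or in a new sibling class, and InvoiceBag stays exactly as it is.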
The larger point in the article, however, that code that asks (queries) should be separated out from code that tells (commands) is an excellent one for the reasons the author specifies.
Methods are the Bread of Life
This is a key statement from that article that outlines my only real disagreement with Rob here:
The fundamental principle of Object Oriented programming is the unification of methods and data. Splitting this up inappropriately gets you right back to procedural programming.
Real-world concerns, in my experience, almost never obviate this idea. I don't find that the proverbial Invoice class example becomes any more unwieldy if it knows how to load its own data from a DB, how to save itself, how to reconcile itself if so instructed, etc.
The key element here is that, as big and ugly as it might be unto itself, the only thing that matters to a caller is the Invoice interface. Only a certain percentage of the methods in a class are going to be publicly exposed, and that's really where you want things neat and tidy, first and foremost.
Now, this is not to say that all your private methods can be a rats' nest. That's on you. But my experience is that once you let the concept of modularity take over, you will have an explosion of methods, but it won't be "messy".
Another aspect of this is borne out in the PP article's discussion of the Law of Demeter. I will sacrifice multitudes of dinky-ass methods to achieve minimal coupling ANY DAY.
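For anyone who hasn't met the Law of Demeter, here is a tiny sketch of the trade being described. The Address, Customer, and CustomerOrder classes are hypothetical and exist only to show the forwarding methods.

```java
final class Address {
    private final String city;

    Address(String city) {
        this.city = city;
    }

    String city() {
        return city;
    }
}

final class Customer {
    private final Address address;

    Customer(Address address) {
        this.address = address;
    }

    // Dinky forwarding method: callers never need to see Address.
    String city() {
        return address.city();
    }
}

final class CustomerOrder {
    private final Customer customer;

    CustomerOrder(Customer customer) {
        this.customer = customer;
    }

    // Another one: callers never need to see Customer either.
    String shippingCity() {
        return customer.city();
    }
}
```

A caller writes order.shippingCity() rather than reaching through the order into the customer and then into the address. The cost is two extra one-line methods; the gain is that callers depend only on CustomerOrder.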
Cohesion
Actually, I have to admit that I'm not sure I understand what the author means by "the unification of methods and data." Seriously. I've thought long and hard on it. Does he mean the unification of methods and data in a single class? If that is what is meant, then I suppose I am an OO heretic.
Andy said:
I will sacrifice multitudes of dinky-ass methods to achieve minimal coupling ANY DAY.
Dinky-ass methods are often very very useful and not just for the minimization of coupling but to make the code readable, easily maintainable, and to always make the intent of the developer clear.
I would add to this that I also happily create multitudes of dinky-ass classes to achieve cohesive code. In my invoice app (yes, this is a real app I've done a lot of work on) I've got classes called InvoiceStorer (540 lines), InvoiceReconciler (322 lines), InvoiceReleaser (333 lines), InvoiceSearcher (166 lines), InvoiceDAO (a mere 59 lines), plus about 20 "rule" classes that are used by InvoiceReconciler in a true command pattern. In addition there are various other support classes without Invoice in the name, PriceSearcher, for one, that are used by the invoice app.
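The real rule classes are not reproduced here, but the general shape of such an arrangement might look something like the following sketch. Every name in it (InvoiceData, ReconcileRule, the two sample rules, RuleRunner) is invented for illustration and is not taken from the actual invoice app.

```java
import java.util.ArrayList;
import java.util.List;

final class InvoiceData {
    final long totalCents;
    final int lineItemCount;

    InvoiceData(long totalCents, int lineItemCount) {
        this.totalCents = totalCents;
        this.lineItemCount = lineItemCount;
    }
}

// Each rule is a small command object with a single method.
interface ReconcileRule {
    // Returns a problem description, or null when the invoice passes.
    String check(InvoiceData invoice);
}

final class PositiveTotalRule implements ReconcileRule {
    public String check(InvoiceData invoice) {
        return invoice.totalCents > 0 ? null : "total must be positive";
    }
}

final class HasLineItemsRule implements ReconcileRule {
    public String check(InvoiceData invoice) {
        return invoice.lineItemCount > 0 ? null : "invoice has no line items";
    }
}

// The runner only knows how to execute rules, not what any rule checks.
final class RuleRunner {
    private final List<ReconcileRule> rules;

    RuleRunner(List<ReconcileRule> rules) {
        this.rules = rules;
    }

    List<String> reconcile(InvoiceData invoice) {
        List<String> problems = new ArrayList<>();
        for (ReconcileRule rule : rules) {
            String problem = rule.check(invoice);
            if (problem != null) {
                problems.add(problem);
            }
        }
        return problems;
    }
}
```

Adding a twenty-first rule means writing one more small class and adding it to the list; neither the runner nor the invoice data class has to change.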
Were I to have the Invoice "know" how to do all of these things, I would have a very smart invoice, but I would also have one great big messy class file.
Cohesion, in this case, is achieved (well, attempted, anyway) by creating a number of classes, each focused on a subset of invoice functionality. I'm not sure I could see an advantage to boxing up all of this functionality in Invoice. Perhaps this is not what anyone's saying I should do, but perhaps it is.
Yes, all of these classes are tightly coupled to the structure of my Invoice class. But even if we packed all the code in these classes into the Invoice class, the coupling would in essence be there. It wouldn't go outside of the Invoice class, true. But all of the code would still have to be modified in exactly the same way no matter where it happened to live. Thus, I'd argue that any lack of coupling achieved is window dressing only.


Comments on ThingDoer
The concept of ThingDoer may be one level overgeneral. What I would like to see is customerAndOrder2Invoice.
This is because of the discovery during the structured programming epoch that the "ideal" module's functionality could be expressed in a simple sentence.
createInvoice? From what?
I concur that developers and their managers tend to overuse parameter passing, because of the fashionability of the adjective "stateless" and a consequent good/bad moral opposition between stateless (good) and with-state (bad), in which the manager subconsciously associates state, in module-level values, with secrets, presuppositions, agendas and sabotage.
The management ideal is that not only code but also coders be "stateless", without history, who will arrive freshly-scrubbed at work ready to find that their cheese has been moved.
This creates a prejudice against your thingDoer because unlike a stateless daemon, it has a soul: its state.
In the methodology in Build Your Own .Net Language and Compiler, I strongly recommend that (1) all module-level variables be more or less rhetorically gathered into a single Structure and that (2) business-logic-independent code monitor this state for sanity, and conversely for insanity, where two module-level variables have "impossible values".
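Translated loosely into Java, and only as a sketch of the two recommendations rather than the code from the book, the idea might look like this. PricingState and its fields are hypothetical, and the particular "impossible" combinations are assumptions chosen for the example.

```java
// Hypothetical: all of a Thing Doer's working variables gathered into one structure.
final class PricingState {
    long subtotalCents;
    long discountCents;
    boolean finished;
    long totalCents;

    // Knows nothing about how the values are computed; it only flags
    // combinations of fields that can never be valid together.
    void checkSanity() {
        if (subtotalCents < 0 || discountCents < 0) {
            throw new IllegalStateException("negative amount");
        }
        if (discountCents > subtotalCents) {
            throw new IllegalStateException("discount exceeds subtotal");
        }
        if (finished && totalCents != subtotalCents - discountCents) {
            throw new IllegalStateException("total does not match subtotal minus discount");
        }
    }
}
```

A Thing Doer that keeps its fields in one such object can call checkSanity() between steps, so an impossible pair of values is caught close to the step that produced it.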
Therefore I agree on the value of having a thingDoer. But, if specific instantiations of this pattern have as their name and spec an expression, such as createInvoice, with no input, it is not clear where the Invoice comes from.