Jakarta.Apache.Org Is Down and I Don't Feel So Good Myself
Jakarta.apache.org is down. Should this be a big deal to me? I'm not currently trying to read any of their documentation or download any of their excellent and free software tools. So why does the fact that the Jakarta website is down have such a huge and immediate impact on my life?
Well, let me make a prediction. As I write this, if you're working on a Struts application anywhere in the world and you're trying to start said application, this is having a big effect on you as well. You're frantically debugging, trying to figure out why your danged app won't start. What's going on?
It seems that the world of XML has some very sloppy practices built into it. These are sloppy practices used by developers all around the world, including myself, and they are a recipe for disaster.
We consumers and creators of XML files have been greatly encouraged to validate these files against DTDs. This is a standard practice. Any XML file worth its salt starts off with a reference to the DTD that defines its format. Typically these DTDs reside at a single universal location.
For example, the validator-rules.xml file currently giving me such fits references a DTD at the following location:
http://jakarta.apache.org/commons/dtds/validator_1_1_3.dtd
Can you see where this is headed?
Most XML parsers, it seems, when they encounter a reference to a DTD at a remote location, take the trouble to actually read that DTD from that remote location. So what happens when that remote location is unreachable? Well, perhaps it depends on your XML parser, but the standard practice seems to be to fail in a very ugly way. In my case, the fact that jakarta.apache.org is down prevents my web application from even starting.
The practice of referencing remote DTD files is pervasive and, I've come to realize, stupid. When you reference a remote resource like that you're relying on some 3rd party to a) keep their web server up and running forever, b) keep that server at that address forever, c) keep that file in that exact location forever. Why are we relying on such foolish assumptions?
I believe that an XML file should never, under any circumstances, reference a remote DTD. DTD files are small and easy enough to package with the XML file itself. Thus, a reference to a URL can be changed to just name the DTD file.
This is a good practice for so many reasons: a) it removes external dependencies, b) it encourages developers to look at the actual DTD, and c) it means you'll no longer have to care when jakarta.apache.org goes down.
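When the XML file lives inside someone else's jar and you can't easily edit its DOCTYPE, a similar effect can be had from the parser side. What follows is only a rough sketch, assuming the standard JAXP and SAX APIs: an EntityResolver that hands the parser a local copy of the DTD whenever it sees the remote URL. The class name LocalDtdResolver and the in-memory map are made up for illustration; a real application would more likely read the DTD from a file packaged alongside the XML.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: serves local copies of DTDs instead of fetching them.
class LocalDtdResolver implements EntityResolver {
    // Maps the system ID from the DOCTYPE (the remote URL) to the local DTD text.
    private final Map<String, String> localDtds;

    LocalDtdResolver(Map<String, String> localDtds) {
        this.localDtds = localDtds;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = localDtds.get(systemId);
        if (dtd == null) {
            // Unknown DTD: returning null lets the parser use its default lookup.
            return null;
        }
        InputSource source = new InputSource(new StringReader(dtd));
        source.setSystemId(systemId);
        return source;
    }

    // Parses the XML text using local DTD copies wherever one is registered.
    static Document parse(String xml, Map<String, String> localDtds) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new LocalDtdResolver(localDtds));
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

With a resolver like this registered, the URL in the DOCTYPE becomes just a name to look up, and the parser never needs to reach jakarta.apache.org for any DTD in the map.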
Schema vs DTD
I've been thinking more about this issue, and I also asked someone from a major J2EE middleware vendor that I happened to run into what they thought. He differentiated between DTDs and schemas (XSDs) in this area, and suggested that with schemas it would not be an issue. However, if the schema (itself an XML document) references a remote DTD, then you could be back in the same boat...
It seems, though, that the weight really rests on the parser, and whether it is resilient, or at least configurable, in this area. For example, if the DTD is not found or does not come back in a certain number of milliseconds, then maybe the parser can switch to an alternate or proceed without it if possible. Or am I smoking crack? Has anyone else reading this run into problems with remote DTDs? Did anyone else get hit by the same Struts-related outage Rob describes?
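As a rough illustration of the "proceed without it" idea, here is a sketch that assumes the standard JAXP and SAX APIs and a non-validating parse. It gives the parser an empty stand-in for every external DTD, so nothing is fetched over the network. The class and method names are hypothetical, and it does not attempt the timeout-then-fallback behavior. The trade-off is real: no validation happens, and any entities or default attribute values defined in the DTD are lost.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: parses XML without ever fetching an external DTD.
class OfflineParser {
    static Document parseIgnoringExternalDtds(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // well-formedness only, no DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external entity request with an empty document,
        // so the parser has nothing to download.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```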
Dan
Re : Jakarta.Apache.Org Is Down and I Don't Feel So Good Myself
Yes, I did get hit a few times earlier and in fact I am still hit by it now (1pm PST). Is there any solution to it other than waiting for Apache to come back up?
I saw that "struts-config_1_1.dtd" is present in struts.jar, and that's probably why it continues to work. So, if I bundle "validator_1_1_3.dtd" in the same location within struts.jar, will it work? I'll give it a try once I get a copy of that DTD.
Sandy
Re : Jakarta.Apache.Org Is Down and I Don't Feel So Good Myself
I stumbled across this today.
I added the validator_1_1_3.dtd to the commons-validator.jar.
It already contained
validator_1_0.dtd
validator_1_0_1.dtd
validator_1_1.dtd
For now my tomcat hasn't contacted anyone outside of my own net.
CU Sven
Ouch!!
Never thought of it like that, and one word is ouchhhhhhhh. This is bad when that single server goes down, and as you mentioned this has a very bad side effect on your application. I wonder if this is something I might have encountered in the past when my apps didn't work, and I decided to call it a day. The next day I had a colleague work with me to see what he could see, and guess what, it worked fine.....
Server Down
The server is the centralised storage for client systems all over the world, so it must work 24 hours a day. If it has problems, there must be a backup server to compensate. So the server is very important.


Remote DTD Dependence - Ouch!
I've always wondered about this one, but have never been bitten by it myself.
I'm not so sure about the practicality of using embedded DTDs, if that's what you're suggesting (especially in a situation where a third party manages the DTD and is requiring the XML files you submit to conform to it), but I can definitely see the wisdom in keeping the DTD on the local network if it's a self-contained system. For those remote-to-remote integration situations (to turn a phrase), though, I suspect it's hard to get away from a remote DTD for at least one of the parties--else the multiple copies of the DTD might create an unmanageable situation...
My hesitation applies primarily, though, to custom DTDs, not necessarily these "fully baked" DTDs issued by standards bodies and the like. If a DTD is stable, it seems safe to make a local copy of it for validation purposes.
Great post. I guess I'm glad I don't have any XML files associated with this today.
Dan