Friday, April 29, 2005

But, whaddya mean the domain model shouldn't capture everything?

One of the problems I'm currently looking at is how to capture real-time domain knowledge in such a way that it can be machine-checked against the implementation. Admittedly, that's a big sentence filled with big words, so let me give an illustrative example. Suppose you're bouncing a basketball on the ground. This is a control system, because you are trying to bounce the ball against the imperfect ground in a 1-G environment just right so that it gracefully returns to your hand with each bounce. Too much bounce last time? Well, then you'll stop it a bit with your hand. Too little bounce last time? Well, then you'll hunch over a bit to reach it and bounce it a bit extra. Similarly, a rocket is a control system. Using trajectory feedback, engineers and scientists write software to keep the rocket pointing at the right angle and going the right speed to escape the atmosphere of Earth.

Let's suppose that, without telling you, I replace the basketball with a similar-looking ball made of extra bouncy material. Now, when you bounce that ball with the same amount of force you'd use for a basketball, the bouncy ball is going to go out of control, likely hitting the ceiling or bouncing into the street. This is what happened with the Ariane 5. The same horizontal control software from the Ariane 4 was reused for the Ariane 5, though it was a much faster rocket. The rocket went faster than the software expected, an uncaught exception occurred, and the rocket went on a self-destructive spree.
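To make "faster than the software expected" concrete, here is a minimal Python sketch. As I understand the published failure report, the fatal step was a conversion of a large floating-point value into a 16-bit signed integer. Everything else here is my own illustration: the function name to_int16 and both readings are hypothetical, and the real flight software was not written in Python.

```python
# Range of a signed 16-bit integer.
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(value: float) -> int:
    """Narrow a float to a signed 16-bit integer, raising instead of wrapping."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return truncated


# Hypothetical readings: one within the old rocket's range, one beyond it.
slow_reading = 20000.0
fast_reading = 50000.0

print(to_int16(slow_reading))
try:
    to_int16(fast_reading)
except OverflowError as err:
    # In the story above, nothing caught the equivalent of this exception.
    print("conversion failed:", err)
```

The conversion itself is fine for the slower rocket. It only becomes a problem when the same line is asked to hold a value nobody planned for.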

What happened with the Ariane 5? There are a number of really smart people who have opinions about how it could have been prevented, but I think it boils down to a bad model. Somewhere, likely in a really dark closet or desk drawer, there was a document which stated that "The rocket goes XX mph." Somewhere else, there was a very unreadable line of code which stored the speed of the rocket in a variable big enough for XX mph. The very unreadable line of code was reused for a faster rocket without anyone re-reading the dusty document. Certainly, it's hard to point fingers at any one person. When I was a programmer, I didn't have access to the models that the well-paid software architects were concocting miles away at our headquarters in Alameda. But wouldn't it have been nice if there were some magical compiler that, when that poor entry-level programmer tried to reuse the control software on the Ariane 5, said, "Ah ah ah! You should re-check your original assumptions about the speed of the rocket before doing that!"
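Here is a rough sketch of what that "Ah ah ah!" check might look like if the dusty document's assumption traveled with the code instead of living in a drawer. This is only an illustration in Python: the names Assumption, Component, Vehicle, and check_reuse are hypothetical, and the numbers are made up rather than real rocket data.

```python
from dataclasses import dataclass, field


@dataclass
class Assumption:
    """One line from the 'dusty document', kept next to the code."""
    quantity: str
    max_value: float


@dataclass
class Component:
    name: str
    assumptions: list = field(default_factory=list)


@dataclass
class Vehicle:
    name: str
    # Largest value the vehicle is expected to produce, keyed by quantity.
    limits: dict = field(default_factory=dict)


def check_reuse(component: Component, vehicle: Vehicle) -> list:
    """Return a complaint for each assumption the vehicle would violate."""
    problems = []
    for a in component.assumptions:
        expected = vehicle.limits.get(a.quantity)
        if expected is None:
            problems.append(
                f"Ah ah ah! {vehicle.name} states no {a.quantity}; "
                f"{component.name} assumes at most {a.max_value}."
            )
        elif expected > a.max_value:
            problems.append(
                f"Ah ah ah! {component.name} assumes {a.quantity} <= "
                f"{a.max_value}, but {vehicle.name} may reach {expected}."
            )
    return problems


# Made-up numbers, for illustration only.
control = Component("horizontal control", [Assumption("horizontal speed", 32767)])
old_rocket = Vehicle("old rocket", {"horizontal speed": 20000})
new_rocket = Vehicle("new rocket", {"horizontal speed": 50000})

print(check_reuse(control, old_rocket))
for complaint in check_reuse(control, new_rocket):
    print(complaint)
```

A check like this only knows about the assumptions somebody bothered to write down, which is exactly the question of how much the domain model should try to capture.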

Most research involves reading other people's ideas about things. Currently, I'm reading "Domain-Driven Design" by Eric Evans. It's a nicely written, down-to-earth book about how to incorporate domain models into the software design process. To be clear, the word "domain" simply means "the world" to software engineers. Except that, for most software, programmers don't have to worry about the WHOLE world. A domain model is simply a description of all of the things in the world that programmers do have to worry about. For example, rocket scientists don't need to worry about the weight of a kitten, but they do have to worry about the gravitational pull of the earth.

I paused at the following statement from Evans:
"The trouble comes when people feel compelled to convey the whole model or design through UML [Unified Modeling Language]."
I'm one of those people. I don't necessarily use UML, but what I'd like to do is capture every detail about the final implementation in the domain model, so that we have some kind of "Ah ah ah!" checking mechanism like the one I dreamed up earlier. I'm trying to reconcile my own visions of improving the ways that people write code for dangerous devices (like rockets) with what Evans is saying about the bad habits people have when modeling domains.

Some people, and I'm one of them at times, have the vision that someday a person will be able to write an English description of what she wants her code to do, and some magical translation will generate the code from that description. More and more, I don't think that it's possible, much less desirable. Besides, it would put me out of work.

1 comment:

Anonymous said...

I wrote a "paper" in my software engineering class that applied Goedel's Incompleteness Theorem to attempts at formal modeling, basically arguing that all such attempts were futile. This was entirely from a philosophical, not a mathematical perspective, so it is really just BS, but it might be true.