28 January, 2013

Rube Goldberg rules software development


I've often touched on the topic around this articles thesis in my blog, here is a most recent post:

http://sent2null.blogspot.com/2012/03/engineers-versus-programmers.html

:for example. I've often said that engineering is an art while programming is more of a Science...unfortunately for those who are really good at programming...building big complex software is more art than Science.

The differences are ones of scale, the biological analog serves purpose here to illustrate. A living being is an amazingly complex organism with billions of cells acting seemingly in harmony, together over the course of about 70 years for most people...these cells replicate, divide and grow without issue until the whole clock work comes grinding down to a halt under the effects of progressive degeneration. Yet the darn thing runs for 70 years, it is amazing  that it works at all.

Now that we are getting into the code that defines how cells are organized, replicate and grow...we see how. It is an amazingly *over engineered* system, instead of having end to end efficiency in mind it was designed in another way...to be maximally efficient from one step to the next (that's the only way evolution could do it without a watch maker at the works) and so, the happy accidents of efficiency...many due to proximity of chemical or molecular affinity piled up over the last few BILLION years to create the super complex systems of living beings of today...some of which that can run like us to 70 years before falling apart!

Software is much like biology in this way, save the difference is the clockwork is in our classes, methods, objects, threads...however just like in biology, the big picture view over how those many components interact over application run time BEFORE the application is written is a huge mystery to most coders. They are great at defining the optimal algorithm to sort a bunch of digits or find edges between various shapes but bad at seeing down the road a bit to understand how those low level interactions bubble up for the entire system. We've known for decades about the various systems in the cell but only in the last decade are we figuring out how they are instantiated (stem cells) how those cells differentiate to specific types (polymorphism of cells) how those differentiations are regulated across tissue types to make sure (for the most part) the wrong cell types don't appear in the wrong places (cancer). Knowing how a cell works doesn't answer any of those  questions...which are locked up in the regulating and developmental code of the dna (much of it non coding and formerly called "junk"!).

The problem with software development today is that the planning of the engineering side of it is lacking while the planning of the programming side is well developed. So rather than taking the time to really understand the problem scope and shape a generalized solution to it...coders just get to work filling in patches to leaks as defined by a rough understanding of what the client wants. Putting grout and brick in place until the leaks stop...the end result of course is not always pretty or elegant...and like biological systems over time and stress reveal itself to be an inefficient Rube Goldberg like process. I detailed a list of considerations to keep in mind to perform good design in the following post:

http://sent2null.blogspot.com/2008/02/considerations-during-design.html

Now, UML is something that has been pushed for some time, the only time I've used it on a project was when I was using it to extract my class hierarchy from the source code into pretty diagrams I could print. That was it, the over all design of the application had already been diagrammed dozens of times on paper and pencil. The slogan of Apriority LLC the company I founded to write AgilEntity the project in question is "taking the time to properly design". I saw over and over again the problems that happen when a long design stage is not enabled for writing code...you end up building things that work...but just like Rube Goldberg machines. Incredibly fragile to changes that push the system out of the precisely designed for regime. This is a major reason why so many enterprise software projects eventually crumble under their own inefficient weight once load hits them and need to be re-architected. I thank the many horrible engineers out there writing code for all the jobs I've had where I was paid well to help fix such systems!

That said, specifications are different. A spec. can be simply a well detailed header comment above a class or method...it can reference the related classes it will be used in and why and maybe note how it could be changed in the future to enable some yet undesired functionality. The key is enough supplemental information about what the code is doing to allow it to be a) understood and b) changed in relation to all other code that may call on it without causing unseen pathology..once the changes are deployed.

I have a few examples of my developments in AgilEntity chronicled that show how good *engineering* can enable massive refactoring tasks in short time that in less well designed systems would simply not be attempted. Here's one:

http://sent2null.blogspot.com/2009/03/late-hour-super-class-surgery.html

When the time for refactoring is greater than the time of writing from scratch it isn't worth it to refactor.

No comments: