There is a perception that enterprise software is easy. The thought process goes something like: “How can adding up numbers and producing reports be as hard to do as programming image recognition, or writing software to control nuclear power plants?”
One thing I have learned is that if you hear a programmer say that kind of
thing, they have never written a real-life enterprise system.
I once worked for an aerospace company. I wrote software to calculate the
exact antenna position for an imaging radar satellite. This was done by
calculating the Doppler shifts in the data being reflected from the ground. The point directly below the satellite was the slowest moving point relative to the satellite, and hence we could find the mid-point in those Doppler shifts. It was more accurate than using the telemetry data. There was a lot of physics involved, and it all had to be implemented in software in a way that could come up with the most accurate measurement. That sounds highly technical and complicated. The theory behind it was, but once all the equations had been developed it was not a super-hard programming task.
I have worked on computer simulations and mathematical optimizations. I
have also built computer language parsers, written device drivers, designed communications protocols, written parts of operating systems,
and analyzed musical harmony. Combining all that with the large number
of enterprise systems I’ve written, it means that I’ve worked on pretty
much every kind of software there is.
So what software is the hardest? Without hesitation I would say enterprise software.
There is a lot of arrogance among software programmers. We all think we are the best and can do anything when it comes to computers. In my experience there is nothing like trying to write an enterprise system to teach a programmer some humility.
But why is this? To examine this question, we can divide software into four
different types. These are rough categorizations, as software is hard to pin down. There are always counterexamples to any classification scheme. However, for what it is worth:
1. Scientific and Engineering software
This covers all the aerospace programming such as the Mars Lander or the
Cassini mission. It covers all the process control systems in power
plants and electrical grids. It includes all the engineering analysis to design bridges, buildings, and dams. I’m also throwing AI, robotics, 3D printing, and voice processing in here as well. Also, all the operating systems stuff along with network/communications and encryption.
2. Packaged Software
The most prominent example of packaged software is Microsoft’s Office
suite. This category also includes graphic art programs such as Adobe
Illustrator and Photoshop. It includes all the business software, like
QuickBooks, and a myriad of other programs that are purchased in some
manner by users. The common thread here is that this software is in
effect “certified” by the manufacturer, as they put their brand on it
and hence their reputation.
I include some so-called “cloud solutions” in this category as well. For example, Adobe Creative Suite (now called Creative Cloud) does not work in a browser but requires installation of the program that talks to the cloud. The word cloud here is just advertising. It’s installed software like any other packaged software; it just happens to access the cloud heavily. It also tends to update itself with or without user permission.
This kind of software is sometimes charged for by a monthly subscription, but that is just billing practice and doesn’t impact the issues we are considering.
Microsoft Office 365 is similar to the Adobe product: it is basically software
that updates automatically. In both cases data can be stored on remote
servers, but that is really conceptually no different than storing it on
a local hard drive or a company hard drive on a server in your office.
3. Cloud Software
This software works in a browser. Basically we are talking about HTTP
requests that result in bundles of HTML, CSS, and JavaScript being sent
to the browser. The software is a combination of the JavaScript on the
browser and the back-end code that runs on the server.
4. Enterprise Systems
This is the focus of our discussion here. Enterprise systems are software
that is integrated into the operations of an organization, with the most
common handling general ledger, financials, payables, and receivables.
There are three factors that greatly influence how difficult it is to develop software:
1. Control Over Features
Does the project team have the final say when it comes to features and
changes? Do they have to consult with an outside constituency?
2. Generic or Constant Test Data
How do you generate test data? Is there a lot of existing test data, or do
you have to try to construct it? If you have to construct it, how much
coverage do you generate?
3. Change Over Time
Does the software have to change its logic over time, yet still preserve
existing information? And are there changes during development?
Let us examine each type of software with respect to these features.
When I wrote software for a radar imaging system, I could take some archived satellite data and run my software and look at the results. I could
keep doing this until it was giving me seemingly correct answers. Then I
could process the image and check the fidelity.
I didn’t have to worry about the format of the satellite data changing,
or the laws of physics changing. It was totally repeatable and predictable.
And testing was straightforward. I could grab any part of the satellite
data I wanted and image it. Analyzing the image showed how much a more
accurate antenna position improved the resolution of the image.
This meant that a massive amount of data was readily available for
testing.
In technical/engineering software, decisions on changes almost always rest with the project team. For example, say they have a data entry screen.
The information that is gathered is totally determined by the team to
ensure that they have enough data and options to do the calculations.
They decide how best to lay out that screen, and what options should be
on it. They make these decisions based on the technology and how it
should be controlled. These are decisions made within the team with very little input from external sources.
Most of this kind of software is used for a single case at a time. You run
an analysis of a dam, or a bridge, or you have one space mission. It is
normally for one particular situation. If it is modified, a whole new
version is produced. The new version might require a different format of data but there is rarely a need to be able to work on the old data.
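When you look at the three factors as they apply to scientific/engineering software you can see that:
1. Control Over Features
Decisions on features and changes rest almost entirely with the project team.
2. Generic or Constant Test Data
Test data is plentiful and repeatable; archived data can be rerun as often as needed.
3. Change Over Time
A modification produces a whole new version, and there is rarely a need to work on the old data.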
All of this makes scientific/engineering systems less difficult than they might seem on the surface.
In the 1990s I was an owner/partner of Paradigm Development Corporation, a contract-software house that did a lot of work for Microsoft, among other large software companies. We wrote file format converters for MS Word and Excel and wrote other utilities and add-ons.
One project we had was to remove unused lines in the MS Word code base. In a large, high-entropy code base (that is, one carrying a lot of
technical debt) it can be hard to know whether lines of code are actually used. Because parts of the code could modify other parts of the code, it required an engineer to carefully walk through the code to see if a particular block of code was ever used. These blocks of unused, abandoned code are like machines in an industrial plant that are never used and gather dust, but still get in the way of normal operations.
The MS Word code base was high entropy. That is not surprising given that,
even in the mid-1990s, it was at least a decade old and had been worked
on by a continually changing group of programmers.
You may be thinking, “But MS Word works well.” Perhaps you use it
yourself. It’s stable and gets the job done without a massive number of
errors.
So how does a high-entropy code base result in a stable product?
It’s all about testing. MS has a huge QA team that is relentless in testing. (Well, they used to at any rate. These days the quality of testing at MS has been drastically reduced, as illustrated by all the problems that have been occurring in the various releases of Windows 10.) Throughout the development cycle QA is testing and reporting bugs. Bit by bit, over time, the bugs are largely squashed, and the remaining bugs are deemed minor enough to live with.
At this point the code is frozen, and a last round of testing is done. If
anything new comes up, it is fixed and the software is completely
retested. This is because everyone knows that any change could cause a
major problem. Any change, no matter how minor, should cause a complete redo of QA. This is just QA 101.
This doesn’t necessarily mean they have reduced the entropy of the code.
They may have rewritten some areas that were particularly bad, but in
general they have just found a ledge on the steep slope of system stability. By hammering at the testing and making sure that all the thousands of bugs are inconsequential or manageable, they can produce a version suitable for release.
Packaged software is, in effect, certified. We trust software from a reputable manufacturer like Microsoft, or Apple, or Intuit. This is because we know they thoroughly test their software. While that doesn’t guarantee you won’t have some problems, they will probably be minor and usually there is a work-around.
But notice there is one thing about the testing of package software that really makes it easier.
Take testing word processing software, for example. A document is a
document. It doesn’t matter what the actual words are; they could be
“Lorem ipsum. . . .” What matters for testing is how the software
handles the general word processing characteristics of a document,
things like: very long words, paragraphs that take up more than a page,
multiple columns, pagination, and footnotes. The actual text is
immaterial.
So, testing package software is not as hard as it could be. Microsoft
maintains a library of test documents and word processing actions. These
documents form an incredibly rich set of test data that pits the new
version of the software against every problem they’ve logged before, and
a whole set of specially-generated samples of extreme conditions.
An office suite is a set of tools to be used by large numbers of different
people to author content, whether it is a presentation deck or a
spreadsheet. There is nothing in the software that is specialized for a
particular user — everything is general, and hence can be tested with a
generalized test set.
The features of a package like MS Word are totally determined by the
project team. Certainly, marketing gets involved and introduces customer
needs and wishes, but if the team is up against a deadline, some of
those feature requests get cut in the interest of getting the product
shipped. These decisions are made by the project team. Marketing might
object, but if the project team says they can’t get it done on time, they will win the argument. If the programmers say a feature has to be implemented in a certain way for technical reasons, they win that
argument too.
Package software is shipped with a clear set of capabilities. These don’t
change until the next release. This means that new features can be done
in the time between releases. The data may not be compatible from one
release to another and may therefore require an import program to move
from an old release to a new one. But there is no requirement to have
new feature code work on both old and new data.
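When you look at the three factors as applied to package software:
1. Control Over Features
The project team decides which features ship and how they are implemented.
2. Generic or Constant Test Data
Everything is general, so it can be tested with a generalized test set built up over many releases.
3. Change Over Time
Capabilities are fixed between releases, and new feature code does not have to work on both old and new data.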
In this model of software delivery, the supplier runs servers that are
accessible over the Internet (“the cloud”) and clients run software,
browser or app, on their local computer (which could be a phone) to
display information on the screen and allow input.
Clearly one of the characteristics of this model is that the software can be
continually updated (which some might describe as cloud software’s
biggest advantage and its biggest weakness). However, the current
approach to this is to develop a Minimum Viable Product (MVP). This
means software with the smallest number of features that is still of
enough value to attract a large number of customers.
All of the decisions about features are made by the project team. This is a
centralized service that is trying to develop features that are wanted
by the most customers. This means that features are driven by marketing
reasons and certainly not by any outside group.
If there are errors they are fixed as quickly as possible and the new code
is put up on the website. Any client logging in after that will be using the new code. Because of this fast fix-and-deploy cycle there is a tendency to use the clients as the QA department. No one client is that important, so if a particular client is inconvenienced it isn’t a major problem. Hence, testing is easy because there is not as much of it done. Early users of these services have a rough road, so it is essential that the service being offered is compelling enough that clients will put up with it (the Viable part of MVP).
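When you look at the three factors as applied to cloud software:
1. Control Over Features
All of the decisions about features are made by the project team.
2. Generic or Constant Test Data
Less testing is done up front, because fixes deploy quickly and the clients act as the QA department.
3. Change Over Time
The software is updated continually, and every client picks up the new code the next time they log in.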
This is probably the second hardest kind of software. It looks easy to start
but can become really difficult when the client base grows very large.
Also some of these applications are enterprise applications. However,
they are generalized applications that force the client to adapt to the
service with, for the most part, limited customization. This means that
although it is enterprise, control over features belongs to the
programming team, which makes things much easier.
Finally, we come to Enterprise software where things are much harder.
Once, I was building a system for a large union. At that time I still believed in methodologies, and I had a design document. I wanted to make what the system did and didn’t do very clear. One feature was clearly listed, in large type, as something the system would not have. All the managers signed off on the design. When the system started to be used operationally, one of the staff came to me and said that she couldn’t do her job without that feature. It was an essential operation to the organization. None of the people involved in the reviews had realized that the feature was essential.
Now what was I to do?
I could have taken a hard stand and said that it wasn’t in the
deliverables, but an enterprise system is about having a system that works for that particular organization. I had no real choice. If the system couldn’t handle the union’s operations then it would probably be rejected. I had to add the feature. Clearly, I didn’t have total control over features.
The control over change rests mainly with the user base. The enterprise
system has to allow people to do their jobs and process information. If a
feature is missing it may mean that someone doesn’t have the information they need, or that they will have to go through a time-consuming activity to produce it some other way (usually using a spreadsheet).
Then there are changes. If a tax change is introduced by a level of
government, the organization has no choice but to respond to it. If
senior management mandates some organizational change, the system will have to respond. Changes happen all the time in enterprises and they
require changes in systems to support the new ways of doing things.
In a system such as payroll software, you have to have test cases that
cover a good percentage of the possibilities. Think of all the
possibilities that a person working under a union contract could generate, given all the combinations and permutations of working: statutory holidays, more than eight hours in a day, shifts after midnight, lunch entitlements, booked salary leaves, garnishees, and so on.
Let us suppose there are 20 different such payroll events that are possible
for an employee. Each one is either off or on, so the possibilities form a 20-bit binary number, which means there are about a million of them (2^20 = 1,048,576). Of course, not all combinations are valid, and some events are more complicated in that they have parameters.
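The arithmetic is easy to check. The sketch below counts the on/off combinations for 20 events and then applies a single validity rule to show how little one rule prunes the space. The rule itself (events 0 and 1 cannot occur together) is invented here purely for illustration; it is not taken from any real contract.

```python
from itertools import product

NUM_EVENTS = 20  # the 20 on/off payroll events supposed in the text

# Each event is either off (0) or on (1), so there are 2**20 combinations.
total = 2 ** NUM_EVENTS

def is_valid(flags):
    """Hypothetical rule: events 0 and 1 are mutually exclusive."""
    return not (flags[0] and flags[1])

# Brute-force enumeration is still feasible at this size.
valid = sum(1 for flags in product((0, 1), repeat=NUM_EVENTS) if is_valid(flags))

print(total)  # 1048576
print(valid)  # 786432
```

Even with that rule in place, three quarters of the combinations remain, and this is before any event carries parameters of its own.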
If you don’t want to get drowned in a combinatorial explosion you have to
maintain strict orthogonality (independence) among all of these factors.
However, in practice some of these possibilities are dependent on others, so that makes a strict formal architecture a necessity. This means you need something like a language to describe the operations and the execution flow so that changes can be accommodated. To make things even more difficult, the independence between these factors can be modified by changes. For example, a benefit can become taxable, so that something that was independent of the tax calculation no longer is. Things that were safely independent no longer are, and that can create significant changes in your code.
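One way to picture "something like a language" is to describe pay items as data, so that a dependency such as taxability is a declared property rather than logic buried in code. The following is a minimal sketch under that assumption; `PayItem`, `FLAT_TAX_RATE`, and the amounts are all hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal

FLAT_TAX_RATE = Decimal("0.20")  # made-up rate, for illustration only

@dataclass(frozen=True)
class PayItem:
    name: str
    amount: Decimal
    taxable: bool  # the dependency on the tax calculation is declared as data

def tax_due(items):
    """Tax is computed only over the items declared taxable."""
    taxable_base = sum((i.amount for i in items if i.taxable), Decimal("0"))
    return taxable_base * FLAT_TAX_RATE

# Before the rule change the benefit is independent of the tax calculation.
before = [PayItem("regular pay", Decimal("1000.00"), True),
          PayItem("transit benefit", Decimal("100.00"), False)]
# After the change the same benefit is taxable; only the description changes.
after = [PayItem("regular pay", Decimal("1000.00"), True),
         PayItem("transit benefit", Decimal("100.00"), True)]

print(tax_due(before))  # 200.0000
print(tax_due(after))   # 220.0000
```

In this toy version the rule change touches one flag. In a real system the same change may ripple into every calculation that assumed the benefit was tax-free, which is where the difficulty lies.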
Now, if this were an aerospace project, you could generate and test a million
base combinations with random legal parameters. You could spend a year
or two and write a generator for random legal test combinations and
spend another year running it with billions of cases until it works flawlessly. To make things easier, in an aerospace application, as
opposed to an enterprise application, the relationships between items
will not change.
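A generator of random legal test combinations might look roughly like the sketch below. It uses rejection sampling with a seeded random number generator and checks an invariant on a toy pay calculation. The event names, the legality rule, the pay rates, and the calculation itself are all assumptions invented for this example.

```python
import random

# Hypothetical on/off payroll events; a real contract would have many more.
FLAGS = ["stat_holiday", "overtime_approved", "night_shift", "on_leave"]
REGULAR_QUARTER_HOURS = 32  # eight hours, counted in quarter hours
STRAIGHT_CENTS = 500        # made-up pay per quarter hour, in cents
OVERTIME_CENTS = 750        # time and a half

def is_legal(case):
    """Hypothetical validity rule: no approved overtime while on leave."""
    return not (case["on_leave"] and case["overtime_approved"])

def random_legal_case(rng):
    """Rejection sampling: draw random cases until one passes is_legal."""
    while True:
        case = {flag: rng.random() < 0.5 for flag in FLAGS}
        case["quarter_hours"] = rng.randint(0, 64)  # parameter: 0 to 16 hours
        if is_legal(case):
            return case

def gross_pay_cents(case):
    """Toy calculation standing in for the system under test."""
    worked = case["quarter_hours"]
    regular = min(worked, REGULAR_QUARTER_HOURS)
    extra = worked - regular
    extra_rate = OVERTIME_CENTS if case["overtime_approved"] else STRAIGHT_CENTS
    return regular * STRAIGHT_CENTS + extra * extra_rate

def run_random_tests(count, seed=0):
    """Check an invariant on many random legal cases; return how many ran."""
    rng = random.Random(seed)
    for _ in range(count):
        case = random_legal_case(rng)
        pay = gross_pay_cents(case)
        worked = case["quarter_hours"]
        # Pay is never below straight time and never above time and a half.
        assert worked * STRAIGHT_CENTS <= pay <= worked * OVERTIME_CENTS, case
    return count

print(run_random_tests(10_000))  # 10000
```

The fixed seed makes a failing case reproducible. Note that the approach only pays off when `is_legal` and the invariant stay put, which is exactly what an enterprise setting does not guarantee.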
That’s the kind of testing that aerospace engineers do — because they have the time and budget. No one wants to spend hundreds of millions of dollars to launch something into space and have it fail.
But for an enterprise system that is being built in the midst of continual
change and trying to model an organization structure that no single
person fully understands, you don’t have that kind of time and budget.
Basically, the only way is to do your testing in the guise of installation.
What is usually being installed is the package world’s idea of beta software, at best. All you can do is some simple testing, and then turn it loose on real life and fix it when it goes wrong.
Let’s be honest here. That is what happens in the majority of cases.
You can see that in all the other software types the objectives were specified. When I was writing software to help determine the exact
direction a radar antenna was pointing, it was clear what was wanted. We
wanted the most accurate angle that we could extract from the data,
because that increased the quality of the image.
That never changed throughout the project.
If you are working on package software there is a set of features to
implement. In reality some of those features are dropped and some new
ones may be added during the build. But it is the team that really
determines that. It is the software manufacturer that determines which
features are included and which are not. It is also the manufacturer
that determines how each feature will be implemented, with input from
marketing, who think they understand what users want, and development,
who want to do things in ways that are easier to implement.
In enterprise computing what is needed is much less clear. It resides in
the minds of the operational staff, who aren’t systems people and often
don’t remember something they need until later when they have to perform some task.
So, what is being built is not totally clear, and the builders don’t have
control over what is wanted; that power is spread out in the
organization.
This is very difficult to control. On top of this, the business logic can
change month to month. This can be the result of some environmental
change, such as government regulations, or of a reorganization, policy
changes, or contract negotiations.
Consequently, the project team has to deal with constant changes. This doesn’t happen in aerospace programming — the speed of light doesn’t change, Saturn doesn’t suddenly decide to go around the sun in the opposite direction.
In enterprise computing the organization’s operational needs dictate the
features. As an IT person you can complain all you like, but when the
user says “I can’t do my job without that feature,” you are kind of up
against the wall.
In enterprise, unlike other kinds of software, you are trying to build new
software, not knowing what features you may be required to implement
during the project. If the project takes longer than you have estimated (surely that never happens!), then the number of new features inevitably goes up, because all organizations change over time.
When you look at the three factors as applied to enterprise software, you
can see how they add to the difficulty of these systems:
Enterprise systems have some properties that other software simply doesn’t have. Control over features doesn’t reside solely with the project team; the team is at the mercy of people in the organization who are trying to communicate what they need but don’t always know what that is until
later. Testing is almost always done on the live system. Once the system
is running, the code has to be able to respond to changes in the business logic, and yet keep all the past results.
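One common way to let logic change while keeping past results reproducible is to date-stamp the rules and look up whichever version was in effect on the date being processed. The sketch below illustrates that idea only; the rate history, the dates, and the function names are hypothetical.

```python
from bisect import bisect_right
from datetime import date
from decimal import Decimal

# (effective_from, rate) pairs in date order. Values are made up.
TAX_RATE_HISTORY = [
    (date(2018, 1, 1), Decimal("0.10")),
    (date(2020, 7, 1), Decimal("0.12")),
]

def rate_on(day):
    """Return the rate that was in effect on the given day."""
    effective_dates = [start for start, _ in TAX_RATE_HISTORY]
    index = bisect_right(effective_dates, day) - 1
    if index < 0:
        raise ValueError(f"no rate in effect on {day}")
    return TAX_RATE_HISTORY[index][1]

def tax_for(amount, pay_date):
    """Recomputing an old pay date still uses the old rate."""
    return amount * rate_on(pay_date)

print(tax_for(Decimal("1000"), date(2019, 3, 15)))  # 100.00
print(tax_for(Decimal("1000"), date(2020, 7, 1)))   # 120.00
```

A changed rate is the easy case, since it is only data. When the structure of the calculation itself changes, the old and new logic both have to remain available, and that is much harder to keep tidy.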
As a programmer, it is much easier to work on other kinds of software.
Trying to develop software while keeping it in sync with the activities
of a changing organization is really, really hard.
If you don’t believe me, just try writing one of these things.
Enterprise software is the hardest software to write.
This story is adapted from my book Avoiding IT Disasters: Fallacies About Enterprise Systems and How To Rise Above Them
Previously published at https://codeburst.io/enterprise-software-is-the-hardest-software-to-write-c76d59725f3