Enterprise Software Is the Hardest Software To Write

There is a perception that enterprise software is easy. The thought process goes something like: “How can adding up numbers and producing reports be as hard to do as programming image recognition, or writing software to control nuclear power plants?” One thing I have learned is that if you hear a programmer say that kind of thing, they have never written a real-life enterprise system.

I once worked for an aerospace company. I wrote software to calculate the exact antenna position for an imaging radar satellite. This was done by calculating the Doppler shifts in the data being reflected from the ground. The point directly below the satellite was the slowest moving point relative to the satellite, and hence we could find the mid-point in those Doppler shifts. It was more accurate than using the telemetry data. There was a lot of physics involved, and it all had to be implemented in software in a way that could come up with the most accurate measurement. That sounds highly technical and complicated. The theory behind it was, but once all the equations had been developed it was not a super-hard programming task.

I have worked on computer simulations and mathematical optimizations. I have also built computer language parsers, written device drivers, designed communications protocols, written parts of operating systems, and analyzed musical harmony. Combined with the large number of enterprise systems I’ve written, that means I’ve worked on pretty much every kind of software there is.

So what software is the hardest? Without hesitation I would say enterprise software.

There is a lot of arrogance among software programmers. We all think we are the best and can do anything when it comes to computers. In my experience there is nothing like trying to write an enterprise system to teach a programmer some humility.

But why is this? To examine this question, we can divide software into four different types.
These are rough categorizations, as software is hard to pin down. There are always counterexamples to any classification scheme. However, for what it is worth:

1. Scientific and Engineering Software
This covers all the aerospace programming such as the Mars Lander or the Cassini mission. It covers all the process control systems in power plants and electrical grids. It includes all the engineering analysis to design bridges, buildings, and dams. I’m also throwing AI, robotics, 3D printing, and voice processing in here as well, along with operating systems, network/communications, and encryption.

2. Packaged Software
The most prominent example of packaged software is Microsoft’s Office suite. This category also includes graphic art programs such as Adobe Illustrator and Photoshop. It includes all the business software, like QuickBooks, and a myriad of other programs that are purchased in some manner by users. The common thread here is that this software is in effect “certified” by the manufacturer, as they put their brand on it and hence their reputation.

I include some so-called “cloud solutions” in this category as well. For example, Adobe Creative Suite (now called Creative Cloud) does not work in a browser but requires installation of the program that talks to the cloud. The word cloud here is just advertising. It’s installed software like any other packaged software; it just happens to access the cloud heavily. It also tends to update itself with or without user permission. This kind of software is sometimes charged for by a monthly subscription, but that is just billing practice and doesn’t impact the issues we are considering. Microsoft Office 365 is similar to the Adobe product: it is basically software that updates automatically. In both cases data can be stored on remote servers, but that is conceptually no different from storing it on a local hard drive or on a server in your office.

3. Cloud Software
This software works in a browser. Basically we are talking about HTTP requests that result in bundles of HTML, CSS, and JavaScript being sent to the browser. The software is a combination of the JavaScript in the browser and the back-end code that runs on the server.

4. Enterprise Systems
This is the focus of our discussion here. Enterprise systems are software that is integrated into the operations of an organization, with the most common handling general ledger, financials, payables, and receivables.

So why is enterprise so much harder than the other three?

There are three factors that greatly influence how difficult it is to develop software:

1. Control Over Features
Does the project team have the final say when it comes to features and changes? Do they have to consult with an outside constituency?

2. Generic or Constant Test Data
How do you generate test data? Is there a lot of existing test data or do you have to try to construct it? If you have to construct it, how much coverage do you generate?

3. Change Over Time
Does the software have to change its logic over time, yet still preserve existing information? And are there changes during development?

Let us examine each type of software with respect to these factors.

Scientific and Engineering Software

When I wrote software for a radar imaging system, I could take some archived satellite data and run my software and look at the results. I could keep doing this until it was giving me seemingly correct answers. Then I could process the image and check the fidelity. I didn’t have to worry about the format of the satellite data changing, or the laws of physics changing. It was totally repeatable and predictable. And testing was straightforward. I could grab any part of the satellite data I wanted and image it. Analyzing the image indicated how more accuracy in antenna positioning improved the resolution of the image.
This meant that a massive amount of data was readily available for testing.

In technical/engineering software, decisions on changes almost always rest with the project team. For example, say they have a data entry screen. The information that is gathered is totally determined by the team to ensure that they have enough data and options to do the calculations. They decide how best to lay out that screen, and what options should be on it. They make these decisions based on the technology and how it should be controlled. These are decisions made within the team with very little input from external sources.

Most of this kind of software is used for a single case at a time. You run an analysis of a dam, or a bridge, or you have one space mission. It is normally for one particular situation. If it is modified, a whole new version is produced. The new version might require a different format of data but there is rarely a need to be able to work on the old data.

When you look at the three factors as they apply to scientific/engineering software you can see that:

1. Scientific/engineering systems are very much in the control of the project team.
2. The test data is normally readily available.
3. They don’t have to handle changes while still operating on previous data.

All of this makes scientific/engineering systems less difficult than they might seem on the surface.

Packaged Software

In the 1990s I was an owner/partner of Paradigm Development Corporation, a contract-software house that did a lot of work for Microsoft, among other large software companies. We wrote file format converters for MS Word and Excel and wrote other utilities and add-ons. One project we had was to remove unused lines in the MS Word code base. It can be hard to know in a large, high-entropy code base (aka large technical debt) if lines of code are actually used.
Because parts of the code could modify other parts of the code, it required an engineer to carefully walk through the code to see if a particular block of code was ever used. These blocks of unused, abandoned code are like machines in an industrial plant that are never used and gather dust, but get in the way of normal operations.

The MS Word code base was high entropy. That is not surprising given that, even in the mid-1990s, it was at least a decade old and had been worked on by a continually changing group of programmers.

You may be thinking, “But MS Word works well.” Perhaps you use it yourself. It’s stable and gets the job done without a massive number of errors. So how does a high-entropy code base result in a stable product?

It’s all about testing. MS has a huge QA team who are relentless in testing. (Well, they used to at any rate. These days the quality of testing at MS has been drastically reduced, as illustrated by all the problems that have been occurring in the various releases of Windows 10.) Throughout the development cycle QA is testing and reporting bugs. Bit by bit, over time, the bugs are largely squashed, and the remaining bugs are deemed to be minor enough that they can live with them. At this point the code is frozen, and a last round of testing is done. If anything new comes up, it is fixed and the software is completely retested. This is because everyone knows that any change could cause a major problem. Any change, no matter how minor, should cause a complete redo of QA. This is just QA 101.

This doesn’t necessarily mean they have reduced the entropy of the code. They may have rewritten some areas that were particularly bad, but in general they have just found a ledge on the steep slope of system stability. By hammering at the testing and making sure that all the thousands of bugs are inconsequential or manageable, they can produce a version suitable for release.

Packaged software is, in effect, certified.
We trust software from a reputable manufacturer like Microsoft, Apple, or Intuit. This is because we know they thoroughly test their software. While that doesn’t guarantee you won’t have some problems, they will probably be minor and usually there is a work-around.

But notice there is one thing about the testing of packaged software that really makes it easier. Take testing word processing software, for example. A document is a document. It doesn’t matter what the actual words are; they could be “Lorem ipsum...” What matters for testing is how the software handles the general word processing characteristics of a document, things like very long words, paragraphs that take up more than a page, multiple columns, pagination, and footnotes. The actual text is immaterial.

So, testing packaged software is not as hard as it could be. Microsoft maintains a library of test documents and word processing actions. These documents form an incredibly rich set of test data that pits the new version of the software against every problem they’ve logged before, and a whole set of specially generated samples of extreme conditions.

An office suite is a set of tools to be used by large numbers of different people to author content, whether it is a presentation deck or a spreadsheet. There is nothing in the software that is specialized for a particular user; everything is general, and hence can be tested with a generalized test set.

The features of a package like MS Word are totally determined by the project team. Certainly, marketing gets involved and introduces customer needs and wishes, but if the team is up against a deadline, some of those feature requests get cut in the interest of getting the product shipped. These decisions are made by the project team. Marketing might object, but if the project team says they can’t get it done on time, they will win the argument.
If the programmers say a feature has to be implemented in a certain way for technical reasons, they win that argument too.

Packaged software is shipped with a clear set of capabilities. These don’t change until the next release. This means that new features can be done in the time between releases. The data may not be compatible from one release to another and may therefore require an import program to move from an old release to a new one. But there is no requirement to have new feature code work on both old and new data.

When you look at the three factors as applied to packaged software:

1. The project team decides on features. There is always input by marketing but the team has the final say.
2. Testing is made much easier because there is the ability to have a large test suite.
3. New features don’t have to work with old data.

Cloud Software

In this model of software delivery, the supplier runs servers that are accessible over the Internet (“the cloud”) and clients run software, browser or app, on their local computer (which could be a phone) to display information on the screen and allow input.

Clearly one of the characteristics of this model is that the software can be continually updated (which some might describe as cloud software’s biggest advantage and its biggest weakness). However, the current approach to this is to develop a Minimum Viable Product (MVP). This means software with the smallest number of features that is still of enough value to attract a large number of customers.

All of the decisions about features are made by the project team. This is a centralized service that is trying to develop features that are wanted by the most customers. This means that features are driven by marketing reasons and certainly not by any outside group.

If there are errors they are fixed as quickly as possible and the new code is put up on the website. Any client logging in after that will be using the new code.
Because of this fast fix-and-deploy cycle there is a tendency to use the clients as the QA department. No one client is that important, so if a particular client is inconvenienced it isn’t a major problem. Hence, testing is easy because there is not as much of it done. Early users of these services have a rough road, so it is essential that the service being offered is compelling enough that clients will put up with it (the Viable part of MVP).

When you look at the three factors as applied to cloud software:

1. Cloud software companies have totally centralized control and they control every decision made about the software.
2. They produce some test data to get started but largely they use their clients for testing. If some features have problems it only affects a subset of the client base.
3. There is a requirement to keep the old data a client is using operational despite the addition of new features. This eventually becomes a major difficulty that can cause massive problems down the road. The hope is that when you get there you will have enough money to solve them.

This is probably the second hardest kind of software. It looks easy to start but can become really difficult when the client base grows very large.

Also, some of these applications are enterprise applications. However, they are generalized applications that force the client to adapt to the service with, for the most part, limited customization. This means that although it is enterprise, control over features belongs to the programming team, which makes things much easier.

Enterprise Systems Software

Finally, we come to enterprise software, where things are much harder.

Once, I was building a system for a large union. At that time I still believed in methodologies, and I had a design document. I wanted to make what the system did and didn’t do very clear. There was a feature that was clearly listed, in large type, as one that the system would not have. All the managers signed off on the design.
When the system started to be used operationally, one of the staff came to me and said that she couldn’t do her job without that feature. It was an essential operation to the organization. None of the people involved in the reviews had realized that the feature was essential.

Now what was I to do? I could have taken a hard stand and said that it wasn’t in the deliverables, but an enterprise system is about having a system that works for that particular organization. I had no real choice. If the system couldn’t handle the union’s operations then it would probably be rejected. I had to add the feature.

Clearly, I didn’t have total control over features. The control over change rests mainly with the user base. The enterprise system has to allow people to do their jobs and process information. If a feature is missing it may mean that someone doesn’t have the information they need, or that they will have to go through a time-consuming activity to produce it some other way (usually using a spreadsheet).

Then there are changes. If a tax change is introduced by a level of government, the organization has no choice but to respond to it. If senior management mandates some organizational change, the system will have to respond. Changes happen all the time in enterprises and they require changes in systems to support the new ways of doing things.

In a system such as payroll software, you have to have test cases that cover a good percentage of the possibilities. Think of all the possibilities that a person working under a union contract could generate, given all the combinations and permutations of working: statutory holidays, more than eight hours in a day, shifts after midnight, lunch entitlements, booked salary leaves, garnishees, and so on. Let us suppose there are 20 different such payroll events that are possible for an employee. Each one is either off or on, so the possibilities form a 20-bit binary number, which means there are just over a million of them (2^20 = 1,048,576).
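To make the arithmetic concrete, here is a minimal sketch in Python. The event names and the single mutual-exclusion rule are assumptions made up for illustration; they do not come from any real payroll system. It treats each combination of 20 on/off events as a 20-bit number, counts how many combinations survive one hypothetical validity rule, and draws a random valid combination of the kind a test generator might produce.

```python
import random

# Hypothetical event names; the text only says "20 different payroll events".
EVENTS = [f"event_{i:02d}" for i in range(20)]

# Each event is either off or on, so a combination is a 20-bit number.
TOTAL_COMBINATIONS = 2 ** len(EVENTS)  # 1,048,576

# Assumed validity rule, for illustration only: events 0 and 1 can never
# both be on (say, a statutory holiday and booked leave on the same day).
MUTUALLY_EXCLUSIVE = [(0, 1)]


def is_valid(combo: int) -> bool:
    """Return True if no mutually exclusive pair is switched on together."""
    return not any(
        (combo >> a) & 1 and (combo >> b) & 1 for a, b in MUTUALLY_EXCLUSIVE
    )


def count_valid() -> int:
    """Brute-force count of valid combinations over all 2**20 bit patterns."""
    return sum(1 for combo in range(TOTAL_COMBINATIONS) if is_valid(combo))


def random_valid_combo(rng: random.Random) -> list[str]:
    """Draw random bit patterns until one passes the validity rule."""
    while True:
        combo = rng.getrandbits(len(EVENTS))
        if is_valid(combo):
            return [name for i, name in enumerate(EVENTS) if (combo >> i) & 1]
```

Even one rule that couples two events removes a quarter of the space, and every such coupling is something the test generator has to know about. That is a small-scale picture of why dependencies between payroll events matter so much.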
Of course, not all combinations are valid and some events are more complicated in that they have parameters. If you don’t want to get drowned in a combinatorial explosion you have to maintain strict orthogonality (independence) among all of these factors. However, in practice some of these possibilities are dependent on others, so that makes a strict formal architecture a necessity. This means you need something like a language to describe the operations and the execution flow so that changes can be accommodated.

To make things even more difficult, the independence between these factors can be modified by changes. For example, a benefit can become taxable so that something that was independent of the tax calculation no longer is. So things that were safely independent are now not, and that can create significant changes in your code.

Now if this was an aerospace project you could generate and test a million base combinations with random legal parameters. You could spend a year or two and write a generator for random legal test combinations and spend another year running it with billions of cases until it works flawlessly. To make things easier, in an aerospace application, as opposed to an enterprise application, the relationships between items will not change.

That’s the kind of testing that aerospace engineers do, because they have the time and budget. No one wants to spend hundreds of millions of dollars to launch something into space and have it fail.

But for an enterprise system that is being built in the midst of continual change and trying to model an organization structure that no single person fully understands, you don’t have that kind of time and budget.

Basically, the only way is to do your testing in the guise of installation. What is usually being installed is the package world’s idea of beta software, at best. All you can do is some simple testing, and then turn it loose on real life and fix it when it goes wrong. Let’s be honest here.
That’s what happens in the majority of cases.

You can see that in all the other software types the objectives were specified. When I was writing software to help determine the exact direction a radar antenna was pointing, it was clear what was wanted. We wanted the most accurate angle that we could extract from the data, because that increased the quality of the image. That never changed throughout the project.

If you are working on packaged software there is a set of features to implement. In reality some of those features are dropped and some new ones may be added during the build. But it is the team that really determines that. It is the software manufacturer that determines which features are to be included and which are not. It is the software manufacturer that determines how each feature will be implemented, and it does that with input from marketing, who think they understand what users want, and development, who want to do things in ways that are easier to implement.

In enterprise computing what is needed is much less clear. It resides in the minds of the operational staff, who aren’t systems people and often don’t remember something they need until later when they have to perform some task. So, what is being built is not totally clear, and the builders don’t have control over what is wanted; that power is spread out in the organization. This is very difficult to control.

On top of this, the business logic can change month to month. This can be the result of some environmental change, such as government regulations, or it can be because of organizational reorganization, or policy changes, or contract negotiations. Consequently, the project team has to deal with constant changes. This doesn’t happen in aerospace programming: the speed of light doesn’t change, and Saturn doesn’t suddenly decide to go around the sun in the opposite direction.

In enterprise computing the organization’s operational needs dictate the features.
As an IT person you can complain all you like, but when the user says “I can’t do my job without that feature,” you are kind of up against the wall.

In enterprise, unlike other kinds of software, you are trying to build new software, not knowing what features you may be required to implement during the project. If the project takes longer than you have estimated (surely that never happens!), then the number of new features inevitably goes up, because all organizations change over time.

When you look at the three factors as applied to enterprise software, you can see how they add to the difficulty of these systems:

1. Enterprise systems are trying to home in on what works for the organization. The features needed are not really known until the system is used.
2. Testing is really difficult and is usually mostly done by the users when they start to use the system. Once an organization has switched to a new system everything is an emergency. That means that design is ongoing and if there is already a large amount of entropy these problems could put it over the edge.
3. Business rules can change from period to period. However, all the old data has to be kept operational as it is combined with new data to produce business reports. This results in a complexity that just isn’t there in other kinds of software. It is like fixing an airplane while in flight.

Summary

Enterprise systems have some properties that other software simply doesn’t have. Control over features doesn’t reside solely with the project team; the team is at the mercy of people in the organization who are trying to communicate what they need but don’t always know what that is until later. Testing is almost always done on the live system. Once the system is running, the code has to be able to respond to changes in the business logic, and yet keep all the past results.

As a programmer, it is much easier to work on other kinds of software.
Trying to develop software while keeping it in sync with the activities of a changing organization is really, really hard. If you don’t believe me, just try writing one of these things.

Enterprise software is the hardest software to write.

This story is adapted from my book Avoiding IT Disasters: Fallacies About Enterprise Systems and How To Rise Above Them.

Previously published at https://codeburst.io/enterprise-software-is-the-hardest-software-to-write-c76d59725f3