If you want to be an expert programmer, one of the most powerful skill sets you can have is knowing how to build quality software. With high project failure rates in software — over the 50% mark according to some reports — developers that can:
stand out above the majority.
Study after study, and years of experience, have shown that when software teams put quality first, everything else falls into place:
Software quality has many aspects. The aspects of software quality that developers discuss most often are tests:
Of these, for some reason, Unit Tests get the most attention.
Forget about Unit Tests.
Construction-stage testing has one of the lowest impacts on actual product quality(1). Don’t get me wrong. Keep writing unit tests. But know that Unit tests are small fish in a big sea. If you really want to write quality software, your focus should be somewhere else.
In this post, we’ll look at Validation and Verification.
Unfortunately, there is some confusion between the terms Validation and Verification in software engineering. Here’s how I tell them apart:
Verification is where testing comes in — testing, and several other, often overlooked, but very valuable practices.
Studies over the last 25 years have proven that it pays to remove defects early. Organizations have found that purging requirements and architecture errors before detailed construction begins reduces rework costs by 90–99%, compared to correcting those errors during system test or after release. — Steve McConnell
As a professional software engineer one of your most important tasks has nothing to do with engineering. You’ve got to understand “The Business.” As I mentioned in a previous post, an engineer’s primary job is problem solving. To successfully problem solve, you have to understand your problem domain. For most software engineers, that means understanding the business their company is in.
Why is this important? The vast majority of product errors occur at the requirements and construction stage of development. And, the problems that enter at this early stage are the big ones. Build the wrong thing, and your business fails to achieve its goal. In the best-case scenario, you’re looking at rewrites. Nothing is more expensive or error-prone than a rewrite. In the worst-case scenario… it’s time to dust off your resume.
If you’re lucky, you work in an environment where you have input into the development of the stories or tasks assigned to your team. If so, your story brainstorming session is one of the most important parts of the entire software development life-cycle. When you understand your business and your customer, you’re prepared to spot flaws in a product design. You can contribute to protecting the integrity of your application. You can help prevent the biggest, most problematic issues from impacting your project.
At this stage, the goal is ensuring that the feature(s) you’re developing will benefit the company. Most of the time, this means adding a benefit to the end user.
To be successful at this stage:
Prototype. Create a quick, functional prototype and get people using it. Quality doesn’t matter much here. It just has to work. You’re focusing on the design and UX. Iterate on the design. Users often don’t know what they want until they can touch it. This is one of the reasons why rapid, evolutionary prototyping is so successful. When you have something that users love, build it, or refactor and harden it.
When you’re ready to build, it’s time to focus on some of the standard quality tasks you’re used to.
The most effective known methods of eliminating defects… include requirements models, automated proofs, formal inspections of requirements, design, and code; and static analysis of code and text… These methods have been measured to top 85% in defect removal efficiency individually. — Capers Jones (2)
There’s too much mind space spent on Unit Tests, when there are other software measures that have been proven to be much more valuable. So, for the moment, forget about Unit Tests. Let’s explore some other options:
. Design:
. . . Modeling / Specification
. . . Contracts
. Types & Proofs
. Reviews
Thinking doesn’t guarantee that we won’t make mistakes. But not thinking guarantees that we will. — Leslie Lamport
Do some design up front — before you code. Step away from the computer. Draw out a diagram. Have a conversation about the problem. Think about the problem, and try to understand it. Check your thinking with your team. Iterate. So many problems, so many bugs could be eliminated if developers took the time to think through their designs — rather than “designing at the keyboard”.
I worked, recently, on an application that involved navigating a user through a series of steps. One of the devs on the team implemented a complex Wizard component to control the flow of navigation. They developed this solution without consulting the rest of the team. It was a case of designing at the keyboard. Now, weeks later, we’re looking at refactoring the Wizard out of the application because it’s adding unnecessary complexity. If the developer in question had reviewed their designs with the team in advance, this problem might have been avoided.
When we think through our designs before we code, our code can’t help but improve. One of the tools used by engineers who must design mission-critical and life-critical software is the specification language. Some specification languages are written as contracts alongside the code. Examples include: ACSL, Spec#, Eiffel contracts, JML, and SPARK contracts (see Contracts below). Another kind of specification language doubles as a programming language. Examples include: Coq, Agda, WhyML, and Idris (see Dependent Types & Proofs below). Finally, there are stand-alone specification languages. These are not complete programming languages. Some examples are: Z, VDM, B-method, ASM, and TLA+.
Modelling / specification and simulation are standard practices in engineering. Over the years different modelling techniques have held sway in software — everything from UML to napkin drawings. Some modelling techniques have been clearly shown to be very effective — TLA+ is a good example.
TLA+ is a formal specification and verification language. It helps engineers design, specify, reason about and verify complex software systems. According to Ron Pressler:
TLA+ has been successfully used by Intel, Compaq and Microsoft in the design of hardware systems, and has started seeing recent use in large software systems, at Microsoft, Oracle, and most famously at Amazon, where engineers use TLA+ to specify and verify many AWS services.
However, modelling at this level, while incredibly valuable, is not always worth the cost. That said, not modelling anything is also expensive. Finding a balance for your application, and knowing when it makes sense to spend the time on formal modelling, is very valuable. How do you do it? The more mission-critical your functionality, the more it pays off to spend the time to specify and verify.
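To make "specify and verify" a little more concrete, here is a toy sketch in TypeScript. It is not TLA+, and every name in it is invented for illustration. It models two processes that each increment a shared counter in two steps (read, then write), states an invariant, and exhaustively explores every interleaving to look for a state that breaks it.

```typescript
// Toy model, assumed for illustration only: plain exhaustive search, not TLA+.
interface Proc {
  pc: 0 | 1 | 2; // 0 = about to read, 1 = about to write, 2 = done
  local: number; // the value this process last read
}

interface State {
  counter: number;
  procs: Proc[];
}

const initial: State = {
  counter: 0,
  procs: [{ pc: 0, local: 0 }, { pc: 0, local: 0 }],
};

// All states reachable in one step: any process that is not done may move.
function next(s: State): State[] {
  const successors: State[] = [];
  s.procs.forEach((proc, i) => {
    if (proc.pc === 2) return;
    const moved: Proc = proc.pc === 0
      ? { pc: 1, local: s.counter }   // read the shared counter
      : { pc: 2, local: proc.local }; // write step, local copy unchanged
    // A write stores the (possibly stale) local copy plus one.
    const counter = proc.pc === 0 ? s.counter : proc.local + 1;
    const procs = s.procs.map((q, j) => (j === i ? moved : q));
    successors.push({ counter, procs });
  });
  return successors;
}

// The specification: once every process is done, no increment was lost.
function invariant(s: State): boolean {
  const allDone = s.procs.every((p) => p.pc === 2);
  return !allDone || s.counter === s.procs.length;
}

// Breadth-first search over every reachable state.
function findViolation(start: State): State | undefined {
  const seen = new Set<string>();
  const queue: State[] = [start];
  while (queue.length > 0) {
    const s = queue.shift();
    if (s === undefined) break;
    const key = JSON.stringify(s);
    if (seen.has(key)) continue;
    seen.add(key);
    if (!invariant(s)) return s;
    queue.push(...next(s));
  }
  return undefined;
}

const violation = findViolation(initial);
```

The search finds a reachable state in which both processes are done but the counter is 1: the classic lost update, discovered before any production code exists. Real specification tools do this at far larger scale and with a much richer language, but the shape of the exercise is the same.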
Design by Contract (DbC), like TDD, is a design process. It helps you to design better code. It works like this: before you implement your functions, you explicitly define the inputs, and output of the function, as well as the states of the input and output values before and after the function. These definitions, often called assertions, are written directly into the function. They ensure that the function, and any callers, comply with the contract. If they don’t, the assertions stop the execution of your code.
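Here is a minimal sketch of what that can look like in TypeScript. The `requires` and `ensures` helpers and the `withdraw` function are hypothetical names made up for this example; they do not come from any library. Amounts are whole cents so the arithmetic checks are exact.

```typescript
// Hypothetical contract helpers, invented for this sketch.
// Each one stops execution as soon as its condition does not hold.
function requires(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Precondition failed: ${message}`);
}

function ensures(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Postcondition failed: ${message}`);
}

// Withdraw `amountCents` from `balanceCents` and return the new balance.
function withdraw(balanceCents: number, amountCents: number): number {
  // Preconditions: what every caller must guarantee.
  requires(Number.isSafeInteger(balanceCents) && balanceCents >= 0,
    "balance is a non-negative whole number of cents");
  requires(Number.isSafeInteger(amountCents) && amountCents > 0,
    "amount is a positive whole number of cents");
  requires(amountCents <= balanceCents, "amount does not exceed balance");

  const newBalance = balanceCents - amountCents;

  // Postconditions: what this function guarantees in return.
  ensures(newBalance >= 0, "new balance is non-negative");
  ensures(newBalance + amountCents === balanceCents, "no money is created or lost");

  return newBalance;
}
```

With this in place, `withdraw(1000, 250)` returns `750`, while `withdraw(100, 500)` throws at the third precondition instead of quietly returning a negative balance.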
Having the asserts in your code has several distinct advantages.
Inline, explicit requirements documentation
It’s often said that Unit Tests add value because they document the functionality of the features you’re testing. This is true. However, to get this documentation you have to context switch between your function and the Unit Tests. Using DbC, all of these requirements are inline with the function code.
Testing of all inputs
Unit Tests only test the functionality and inputs you were able to think of. In fact, their ability to cover the code is quite limited:
Unit tests are unlikely to test more than one trillionth of the functionality of any given method in a reasonable testing cycle. Get over it. (Trillion is not used rhetorically here, but is based on the different possible states given that the average object size is four words, and the conservative estimate that you are using 16-bit words.) — Jim Coplien (emphasis added)
So much for code coverage!
Given this, the fact that DbC checks ALL of the inputs your functions actually receive is pretty significant. (Not incidentally, Coplien has recommended DbC over TDD for this very reason.)
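Coplien's figure is easy to check. The sketch below simply redoes the arithmetic using the assumptions stated in his quote (four words per object, 16 bits per word); the variable names are mine.

```typescript
// Coplien's stated assumptions: an average object holds four 16-bit words.
const wordsPerObject = 4;
const bitsPerWord = 16;

// Every bit can be 0 or 1, so one object has 2^(4 * 16) = 2^64 possible states.
const possibleStates = 2 ** (wordsPerObject * bitsPerWord); // about 1.8e19

// Number of distinct cases needed to cover just one trillionth of those states.
const oneTrillion = 1e12;
const casesForOneTrillionth = Math.floor(possibleStates / oneTrillion);
```

Under those assumptions, `casesForOneTrillionth` comes out at 18,446,744. In other words, it would take more than 18 million distinct test cases for a single small object just to reach one trillionth of its state space.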
Testing of relationships between components
Unit tests only test an aspect of functionality in isolation. Most bugs don’t occur in isolation. They occur when units interact. So, most developers who write Unit Tests will write integration tests. And, in writing integration tests, you’re likely to run into similar issues as you did with Unit Tests: How many interactions can you think of to test, vs how many interactions are likely? With DbC, you’re testing interactions by default.
Immediate application failure when a contract is violated
The faster a piece of code fails, the easier it is to debug. If your application has a bug that occurs five functions down the stack from its point of origin, it can be hard to identify. How much easier would it be if your application failed immediately, right where the problem was? DbC fails fast. Right at the source of the problem.
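The difference is easy to see side by side. In this sketch (all function names are hypothetical, and `requires` is a hand-rolled helper rather than a library call), a badly parsed quantity either drifts through two more functions or is stopped at the call that introduced it.

```typescript
// Hypothetical contract helper, invented for this sketch.
function requires(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Precondition failed: ${message}`);
}

// No contract: a bad quantity is accepted and silently poisons the result.
function lineTotalUnchecked(unitPriceCents: number, quantity: number): number {
  return unitPriceCents * quantity;
}

// With a contract: the same bad quantity is rejected where it enters.
function lineTotal(unitPriceCents: number, quantity: number): number {
  requires(Number.isSafeInteger(unitPriceCents) && unitPriceCents >= 0,
    "unitPriceCents is a non-negative integer");
  requires(Number.isSafeInteger(quantity) && quantity > 0,
    "quantity is a positive integer");
  return unitPriceCents * quantity;
}

function orderTotal(lineTotals: number[]): number {
  return lineTotals.reduce((sum, line) => sum + line, 0);
}

function formatCents(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

// A caller parses user input badly: Number("two") is NaN.
const badQuantity = Number("two");

// Unchecked: the NaN travels through two more functions before anyone sees it.
const silentResult = formatCents(orderTotal([lineTotalUnchecked(499, badQuantity)]));

// Checked: the failure happens at the call that introduced the bad value.
let failure = "";
try {
  formatCents(orderTotal([lineTotal(499, badQuantity)]));
} catch (e) {
  failure = e instanceof Error ? e.message : String(e);
}
```

The unchecked path produces the string `$NaN` with no hint of where things went wrong. The checked path stops with `Precondition failed: quantity is a positive integer`, and the stack trace points at `lineTotal`.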
Static Types
Static types seem to be all the rage these days — especially in the front end. The Elm documentation claims that due to its use of static types, “users do not see run-time errors in practice.” Flow’s documentation states that, “Flow identifies problems as you code”. FP Bloggers contend, “A large class of errors are caught [by static types], earlier in the development process, closer to the location where they are introduced,” and, “Types guide development and can even write code for you!” The claims go on and on.
Other sources contend that all the noise about static types is a bit overblown (see, also, here and here) or fails to take into account the costs. Importantly, no one is saying that static types aren’t helpful, just that there’s more to the story.
The following, now famous, research analysed the bug density of projects on GitHub by language. To quote the report: “The charts show no evidence of static/dynamic typing making any difference, but they do show… a gap between languages that focus on simplicity versus ones that don’t.”
Languages sorted by bug density. Repos with more than 100 stars.
Research by Gao, Bird, and Barr, which analysed the ability of Flow and TypeScript to identify bugs, showed that static typing could have caught about 15% of the bugs in the JavaScript projects studied.
…at the confidence level of 95%, the true percentage of detectable bugs for Flow and TypeScript falls into [11.5%,18.5%] with mean 15%… Flow and TypeScript largely detect the same bugs. — Gao, Bird, and Barr
Adding static types to a JavaScript project is not without its problems (as noted above). In my experience, for example, Flow is a bit of a kludge. It’s kind of like Monkey Patching JavaScript to have types. It can be quite painful to work with. However, as Gao, Bird, and Barr state, “even small changes in the number of checked-in bugs can be quite valuable.”
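For a feel of the kind of bug a type checker catches, here is a small TypeScript sketch. The `User` shape and the `emailDomain` function are invented for illustration, and the commented-out lines show code that I would expect the compiler to reject with `strict` enabled.

```typescript
// A hypothetical user record, invented for this sketch.
interface User {
  id: number;
  name: string;
  email?: string; // optional: may be missing
}

// The compiler forces this function to deal with the missing-email case.
function emailDomain(user: User): string | undefined {
  // return user.email.split("@")[1];
  // ^ rejected under `strict`: user.email may be undefined.
  if (user.email === undefined) return undefined;
  return user.email.split("@")[1];
}

const ada: User = { id: 1, name: "Ada", email: "ada@example.com" };
const bob: User = { id: 2, name: "Bob" };

// emailDomain({ id: "3", name: "Eve" });
// ^ rejected at compile time: a string is not assignable to `id: number`.

const adaDomain = emailDomain(ada);
const bobDomain = emailDomain(bob);
```

In plain JavaScript, the first commented-out line would only fail at run time, and only for users like Bob who have no email. With types, the mistake is flagged as you write it.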
Dependent Types & Proofs / Formal Verification
A language is statically typed if the type of a variable is known at compile time. Dependent types allow types to be dependent on values. This means that some aspects of a program’s behaviour can be specified in the type.
Formal Verification (proofing) is the act of proving or disproving the correctness of an algorithm’s compliance with a specification using mathematics. This is a very large field in Computer Science, which I will mostly gloss over. I mention it here only in its relation to dependent types.
Languages with dependent types allow what’s known as end-to-end verification. End-to-end verification is the ability to specify and verify the behaviour of a program from its highest level of abstraction down to the machine instructions, ensuring that the executable conforms with the specification.
By programming with dependent types, it is often possible to prove theorems without writing anything that looks like a proof. Instead, this work is done via type-checking. Dependent types are especially useful for defining data in a way that prevents the construction of any invalid object.
This is incredibly valuable. First, the types provide a specification of functionality that is in-line with the code (a benefit, also, of Design by Contract). Second, it allows for integration proofs — the ability to prove that an application conforms to its specifications.
Unfortunately, this reliability and readability come with a high price tag. According to Ron Pressler, languages with Dependent Types, “are extremely complex, requiring months to learn and years to master. For this reason they are only used by specialists, very rarely in industry, and virtually never without the support of academic experts.” Darn.
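Mainstream languages can still borrow a little of the idea. TypeScript has no dependent types, so the sketch below is only a rough approximation of "preventing the construction of any invalid object": it uses a so-called branded type plus a single checked constructor. The names `Percentage`, `toPercentage`, and `applyDiscount` are made up for the example, and the range check runs at run time rather than being proved by the compiler.

```typescript
// An approximation only: TypeScript cannot express "a number between 0 and 100"
// as a type, so a brand marks values that have passed the check.
type Percentage = number & { readonly __brand: "Percentage" };

// The only intended way to obtain a Percentage: the range check runs once, here.
function toPercentage(value: number): Percentage {
  if (!Number.isFinite(value) || value < 0 || value > 100) {
    throw new RangeError(`Not a percentage: ${value}`);
  }
  return value as Percentage;
}

// Functions that accept a Percentage can rely on the range without rechecking.
function applyDiscount(priceCents: number, discount: Percentage): number {
  return Math.round(priceCents * (1 - discount / 100));
}

const tenPercent = toPercentage(10);
const discounted = applyDiscount(2000, tenPercent);

// applyDiscount(2000, 150);
// ^ rejected at compile time: a plain number is not a Percentage.
```

A dependently typed language would go further and have the compiler prove the range itself. Here the type system only tracks that the check has happened, which is weaker but costs almost nothing to adopt.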
In the end, it’s quite likely that there is value in adding static types to a JavaScript project. Even if the benefit is only 15% of a class of bugs, those are still bugs you don’t have to worry about. The cost of implementing static types is relatively low.
A side benefit of using static types in JavaScript is that you might write faster code. V8, SpiderMonkey, and Chakra make optimizations for functions and Objects with predictable types.
While using dependently typed languages sounds really cool, the current state of things suggests that it is not practical. This does not mean that we cannot benefit from the concepts they employ at all. For example, by combining Design by Contract, Static Types, and Unit Tests we get many of the benefits.
According to reports by the IEEE, reviews can remove up to 90% of errors from a product — before you run the first test case(2). As noted above, Jones, et al., have found similar results from reviews:
formal inspections of requirements [combined with other methods]… have been measured to top 85% in defect removal efficiency individually. — Capers Jones (3)
Reviews aren’t just code reviews. Reviews should happen at several levels:
It’s also better when several people perform the reviews. McConnell reports that researchers found only a 20% overlap in the defects found by multiple reviewers of the same content(4).
If you’re doing code reviews, that’s great. How many people have to approve code before it’s merged? How formal is your code review process? Are you doing requirements reviews? Are you reviewing documentation (if you have any)?
To paraphrase Fred Brooks:
Well over half of the time you spend working on a project (on the order of 70 percent) should be spent thinking. No tool, no matter how advanced, can think for you.(5)
Thinking comes first. Understand the business. Understand the user. Understand the problem. Think about and iterate on the solution. Use processes that allow you the space to think. Use languages and tools that help you think. All these in concert will help you produce high-quality work.
Notes
(1) https://www.seguetech.com/rising-costs-defects/
(2) ibid.
(3) Software Defect Origins and Removal Methods, Jones, 2012, pg 6.
(4) Code Complete, table 20–2.