Where does all the bad code come from?

Some of the top people in the software field spend a good deal of their time examining and improving the quality of existing code bases, and showing developers how to keep their code bases habitable.

Brian Marick kindly filled in a historical gap for me in response to the initial version of this post. He writes: “‘habitability’ was probably coined by Richard P. Gabriel in an article for Journal of Object-Oriented Programming. The article (‘Habitability and Piecemeal Growth’) is included in his 1996 book Patterns of Software (available at https://dreamsongs.com/Books.html).” Thanks for the history lesson, Brian!

It’s worth noting that “legacy” doesn’t automatically mean “bad”. Code that is currently used by numerous people, companies, and government agencies to support activities that are important to them clearly brings value. That value is part of the legacy of the code. But we need a shorthand way to categorize code whose internal design could benefit from improvement, and the word “legacy” has penetrated the market in that sense.

One might expect that, after all these years of harping on “code quality” and “clean code” and “software design principles”, the problem of poorly-designed code would have faded into the background by 2020. Sadly, the problem is more profound than ever.

Why might that be the case? I think there are four main reasons.

Cause #1: Misalignment between education and industry

There is misalignment between the education sector and the demands of industry for software-related skills. People study Computer Science at universities to prepare themselves to enter the job market for application programming. But Computer Science is not application programming. Students are never taught how to design, build, and support application software that must work for a wide range of user needs and operating conditions “in the wild”, that lives for many years, and that must be understandable by someone other than the original author.

Compounding the problem for business application support is the fact that the top-of-the-class graduates from Computer Science programs are attracted to companies that do “interesting” things with technology, while the majority of application programming jobs are in companies that do “mundane” things. Those companies never even see the “best and the brightest” of Computer Science graduates. The result is evident in the quality of their application code.

Cause #2: Quantity over quality in education and training

In a talk dating from 2016, “The Future of Programming”, Robert C. Martin mentions (among other things) that the number of programmers working in industry has doubled every five years since about 1970. That situation has led to a “rush” to prepare people to enter the software job market. Martin points out that at any given time from 1970 to 2016, half the people in the software industry had under 5 years of experience. That continued until 2020, when the Coronavirus pandemic struck.
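
The arithmetic behind the “half have under 5 years” observation is easy to check. The following is a minimal sketch in Python, assuming headcount doubles exactly every five years and that nobody leaves the field; the function names and the baseline size of 1.0 in 1970 are illustrative assumptions, not figures from Martin’s talk.

```python
def workforce(year, base_year=1970, base_size=1.0, doubling_years=5):
    """Relative headcount under pure doubling growth.

    Hypothetical model: no attrition, and growth is perfectly smooth.
    """
    return base_size * 2 ** ((year - base_year) / doubling_years)


def share_with_less_than(years_experience, year, doubling_years=5):
    """Fraction of the workforce that joined within the last `years_experience` years."""
    now = workforce(year, doubling_years=doubling_years)
    earlier = workforce(year - years_experience, doubling_years=doubling_years)
    return (now - earlier) / now


if __name__ == "__main__":
    # Under these assumptions the share is the same in every year.
    for year in (1980, 2000, 2016):
        print(year, round(share_with_less_than(5, year), 3))
```

Under these assumptions the share comes out at about one half regardless of the year chosen, because everyone hired during the most recent doubling period equals in number everyone hired before it. Real-world attrition would change the exact figure, so this is only a rough check on the claim.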

The fact that this situation has prevailed for so long has generated an industry around technical training and job preparation. Everything about this industry is geared toward cranking out entry-level software developers in large numbers, year after year. The programmer factory employs many people besides the programmers themselves, including large-scale training operations and recruitment firms. It comprises not only university Computer Science programs, but also technical colleges and bootcamps, which are more practically-focused and yet still cannot produce well-qualified people in a short time. Most of their graduates learn just enough to be dangerous.

One result: The problematic app developed for the US Democratic Party’s Iowa Caucuses in 2020. Popular press articles focus on various political angles, but I’ve read that on the technical side the poor quality of the code resulted from assembling a team of first-time programmers led by a single “senior” engineer.

There’s another variation on this theme: The strong push for “test automation”. Many people are learning to code (rather than learning to develop software) so that they can write executable test cases. Many companies are now supporting internally-built “test frameworks” that are, themselves, fairly complicated applications. (I did some refactoring on such a code base just this week.) Most of this code is unstructured and follows no consistent naming conventions or coding style standards. The freshly-trained “coders” have no awareness of fundamental software design principles.

Java has been strongly promoted as the language of choice for this type of work. Java is an object-oriented language, and yet I defy you to show me a “test framework” written in Java that reflects object-oriented design principles. There is a rapidly-growing pile of test code that is not habitable, to accompany the already-large pile of production code that is not habitable.

Anna Filina, a consultant who specializes in remediating legacy code bases, has observed that every code base she encounters exhibits the same short list of very basic design issues. She tweets prolifically, so if you’re interested in this topic, follow her on Twitter: @afilina.

It’s as if people are hacking up code in the same way all around the world. And it all boils down to a lack of understanding of basic software design principles; it isn’t that difficult to get right. So why do so many people get it wrong in the same ways?

Clearly, there’s a common denominator behind the commonalities in “bad” code. I think the first two causes I list here reflect that common denominator: Education and training are not focused accurately on job preparation, and rushing people through the training doesn’t prepare them well for the work they will be doing.

Cause #3: Explosive growth of open-source hardware

The “maker” community is vibrant, innovative, open, friendly, and very active. People are designing and building all sorts of interesting, clever, fun, and (sometimes) useful things. The “industrial internet of things” (IIoT) sector is taking off in a big way, using the same approaches and techniques as the hobbyists to create new markets.

The community is carrying forward one of the salient characteristics of traditional engineering – a general distrust of software. If they can bake functionality into the hardware, they will. If they must resort to software to support certain functionality, then they write the software more-or-less as an afterthought. The code that lives within these well-engineered devices is not, itself, well-engineered. The habit of hacking up code quickly and carelessly lives on in this new generation.

And it is a prolific generation. Between hobbyists who create useful things that accidentally become real products, and startups creating IIoT solutions on purpose, the mad rush to write code quickly and without software engineering discipline is not slowing down.

Cause #4: Software developers accept being treated like manual laborers

Organizations that support business application systems are often under market pressure to deliver software quickly. A competitor who gets a desirable feature to market first enjoys a business advantage over other players in that market.

This leads to pressure on software development teams – real or perceived – to rush their work. Everything must be done faster and delivered sooner. How fast is fast enough? How soon is soon enough? No one knows. It isn’t a question of measurement. It’s just a frenzied rush.

There are two problems here, and they reinforce each other. First, there’s the problem of managers not understanding the realities of software. They think it’s possible to cut corners to deliver faster and sooner, when in reality that approach only slows things down. Second, there’s the problem of software developers wanting to be seen as “professionals” while being unwilling to act as professionals.

Imagine civil engineers who are designing a bridge that will carry passenger vehicles across a river. They are approached by the prime contractor and told to get the bridge up by Thursday no matter what. Doesn’t matter if the bridge is still standing on Saturday, and also doesn’t matter how many full school buses fall into the river when the bridge collapses, as long as we meet the Thursday delivery deadline.

What would professional engineers say in that situation? You’re right, of course that’s what they would say. So, why do “professional” software developers not say the same thing? They are only perpetuating the problem when they don’t act as professionals.

Things to come

The economic impact of the Coronavirus pandemic is expected to be long-lasting. One of the immediate effects has been to shrink the job market for people in the software field. Layoffs occur daily, and there is virtually no hiring going on. The doubling of software development jobs every five years has come to an end.

In the meantime, the training juggernaut hasn’t slowed down. Some people haven’t come to terms with the change in the job market yet. They’re still cheerfully advising young people to learn “coding”. Those who have recognized the slowdown have discovered there are no brakes on the training train. The whole training industry is optimized for doubling the market every five years.

So, here’s what I predict. If you’re interested in remediating legacy code, there will be more work available than you can possibly do in the course of a human lifetime. At the same time, there will be a glut of software people on the job market. Hiring managers, HR people, and recruiters can’t tell the difference between you and the average bootcamp graduate. So get ready to live modestly. The days of high salaries for software work are over.

In fact, it’s very possible that the only opportunity you will have to work with software will be as a hobbyist. The unemployment rate is at an all-time high, the current economic slowdown will last long enough to cause permanent damage to the economic system as a whole, and when the recovery comes things will never again be as they were before 2020.