If you watch the news, every once in a while there is a story about a serious software bug that takes down a major system somewhere. It might be an airline’s reservation system, a stock brokerage’s transaction system, or a social media platform. Every hour the system is down may cost millions of dollars in repairs and lost business. The company goes into panic mode until the problem is found and fixed. Afterwards, policies and practices may be added to ensure it doesn’t happen again.
But there is another kind of bug that does not make headline news and does not cause a panic, yet can cost as much as or more than the highly publicized kind. These bugs may lie hidden for years or even decades, with little or no incentive to find and fix them. I am talking about inefficiencies in commonly used software that slowly waste time, bandwidth, and electricity.
A typical bug of this kind might be in a poorly designed function that is part of a common software library or operating system. It might be in a Microsoft Windows disk driver, an Apple macOS network stack, or an open-source string library. The function is so commonly used that it is called billions of times each day across tens of millions of computers. Each time it is called, it wastes several machine instructions. Over the course of a day it burns a few extra watt-hours of electricity and generates a little more heat. It also loses precious time, even if the loss is measured in milliseconds.
The cost of such a bug to any one user may be quite small. Maybe it adds only a few cents, or even a whole dollar, to the monthly electric bill. If it were fixed, a single user might not even notice the slightly lower air conditioning costs. No one person or company feels the pain, so the problem persists indefinitely. But the collective cost, spread over millions of systems, can really add up over months and years.
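To make the scaling concrete, here is a rough back-of-the-envelope sketch. The machine count and the per-machine cost range are assumptions picked only to match the "tens of millions of computers" and "a few cents or even a whole dollar" mentioned above; they are not measurements of any real bug, and the function name is made up for this illustration.

```python
def collective_annual_cost_dollars(cents_per_machine_per_month: int, machines: int) -> float:
    """Yearly cost, in dollars, of a small monthly cost repeated on every machine."""
    total_cents = cents_per_machine_per_month * machines * 12
    return total_cents / 100


# Assumed inputs, chosen only for illustration.
MACHINES = 50_000_000            # stands in for "tens of millions of computers"
LOW_CENTS, HIGH_CENTS = 5, 100   # "a few cents or even a whole dollar" per month

low = collective_annual_cost_dollars(LOW_CENTS, MACHINES)
high = collective_annual_cost_dollars(HIGH_CENTS, MACHINES)
print(f"Collective cost: ${low:,.0f} to ${high:,.0f} per year")
```

Under these assumed figures, a cost nobody notices individually lands somewhere in the tens to hundreds of millions of dollars a year in aggregate. Different assumptions move the total, but the multiplication works the same way.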
It is easy to understand how these bugs get created and escape detection for so long. Some engineer, either inexperienced or in a rush, coded it years or even decades ago. It worked, so no one bothered to inspect it further. Maybe it ended up being used for something entirely different from its original purpose.
Finding and fixing it might take a skilled engineer only an hour or two, but the incentive is just not there. The company that released the software does not pay the electric bills for its use; fixing it would cost the company only the engineer’s time. No important customer is demanding that it be fixed. Also, changing code always carries risk, so the ‘if it ain’t broke, don’t fix it’ mantra often applies.
But the economics follows a pattern similar to that of switching to more efficient light bulbs. The typical incandescent light bulb is really cheap and draws about 60 watts. Newer CFL or LED bulbs use a fraction of that power but cost a little more, and it can be a pain to buy them and replace existing bulbs. Replacing a single bulb that is usually on for a couple of hours each day will save a homeowner less than a dollar a month. It can take several months before the new bulb pays for itself in saved electricity. But if you multiply that over 30 bulbs and several years, the savings can really add up.
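The light bulb arithmetic can be sketched in a few lines. The 60 watts, the couple of hours a day, and the 30 bulbs come from the paragraph above; the 9-watt LED, the $0.15 per kWh electricity price, the $2 extra purchase cost, and the five-year horizon are assumptions for illustration and will differ from household to household.

```python
def monthly_savings_dollars(old_watts: float, new_watts: float,
                            hours_per_day: float, price_per_kwh: float,
                            days: int = 30) -> float:
    """Electricity cost saved per month by swapping one bulb."""
    kwh_saved = (old_watts - new_watts) * hours_per_day * days / 1000
    return kwh_saved * price_per_kwh


def payback_months(extra_bulb_cost: float, savings_per_month: float) -> float:
    """Months until the pricier bulb has paid for itself."""
    return extra_bulb_cost / savings_per_month


# From the text: 60 W incandescent, about 2 hours a day, 30 bulbs.
# Assumed: 9 W LED, $0.15 per kWh, $2 extra per bulb, 5 years.
per_bulb = monthly_savings_dollars(60, 9, hours_per_day=2, price_per_kwh=0.15)
payback = payback_months(2.00, per_bulb)
whole_house = per_bulb * 30 * 12 * 5  # 30 bulbs over 5 years

print(f"Savings per bulb per month: ${per_bulb:.2f}")
print(f"Payback time per bulb: {payback:.1f} months")
print(f"Savings for 30 bulbs over 5 years: ${whole_house:.2f}")
```

The point is the shape of the numbers rather than their exact values: a per-bulb saving too small to notice, a payback period of a few months, and a total that becomes meaningful only after multiplying by the number of bulbs and years.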
Just as with energy-efficient light bulbs, it takes public awareness to get people to address the problem. Programmers need to treat efficiency as a primary goal when writing new functions or refactoring existing ones. Companies need to make it a priority, just as they do for other environmental or social issues.
There is strength in numbers. One person picking up a few pieces of litter may not make much of a difference to a neighborhood or city, but a lot of people chipping in can make a world of difference.