Human history is full of droughts and floods, famines and bumper crops, and boom-and-bust cycles. Even in modern times, when we have reservoirs to store water for dry seasons, food preservation and distribution systems to offset lean years, and savings plans for retirement, we still encounter times when certain resources are scarce.
The recent Covid-19 pandemic, with its forced closures and travel restrictions, disrupted supply chains so that certain goods and services became difficult to find and expensive once located. Sometimes car deliveries were delayed because of a single part that was in short supply. The laws of supply and demand will often adjust prices, changing behavior and bringing things back into balance, but that can take quite a while.
Our behavior often changes based on whether something is abundant or scarce. When gas prices are low, people often buy less fuel-efficient vehicles. When energy is plentiful, we don’t worry about leaving the lights on when we are gone. When we are on a trail in the hot desert, we don’t let a drop of water go to waste.
It can be difficult to get people who live in a dry climate to conserve water in years when rainfall is way above normal. It can be difficult to get your children to not waste food when they have never experienced hunger. It can be hard to stick to a budget when you are wealthy.
The same is true of computing resources. When the first computers came to market, they were expensive, and things like memory and disk space were scarce. Programmers had to make sure every byte of code performed optimally in order to conserve memory or limit disk accesses. Networks were extremely slow, so extra care was taken to make sure every packet was used optimally.
When modern processors became very fast and memory and storage became abundant and cheap, programmers stopped being so concerned about making optimal use of those resources. Who cares if a function takes 10x as long as it really should, or if a data structure takes 100 bytes more than it really needs? The function still runs in less than a millisecond (1/1000 of a second), and the extra bytes go unnoticed when you have billions of bytes of memory available.
But just as millions of people each wasting a valuable resource can cause a widespread shortage, growing data can expose real problems in inefficient code. A function might appear to finish quickly when processing a list of 10,000 items, but throw 10 billion items at it and watch it slow to a crawl. A list of a million records runs fine on a normal computer, but when the number of records exceeds a billion, memory is exhausted and swapping to disk grinds the process to a halt.
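To see how quickly ‘fast enough’ code falls apart at scale, here is a minimal Python sketch. The function names and input sizes are my own illustrative assumptions, not taken from any particular codebase. Both functions remove duplicates from a list; the first feels instant on small inputs, but its running time grows with the square of the input size, while the set-based version grows only linearly:

```python
import time

def dedupe_quadratic(items):
    """Remove duplicates with a list membership test: O(n^2) overall."""
    seen = []
    for item in items:
        if item not in seen:      # linear scan of 'seen' on every iteration
            seen.append(item)
    return seen

def dedupe_linear(items):
    """Remove duplicates with a set: O(n) overall."""
    seen = set()
    out = []
    for item in items:
        if item not in seen:      # hash lookup, effectively constant time
            seen.add(item)
            out.append(item)
    return out

for n in (1_000, 10_000):
    data = list(range(n)) * 2     # every value appears exactly twice
    start = time.perf_counter()
    dedupe_quadratic(data)
    quad = time.perf_counter() - start
    start = time.perf_counter()
    dedupe_linear(data)
    lin = time.perf_counter() - start
    print(f"n={n:>6}: quadratic {quad:.3f}s, linear {lin:.3f}s")
```

Growing the input 10x makes the quadratic version roughly 100x slower while the linear one slows only 10x; at the billions of items mentioned above, the quadratic version would effectively never finish.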
My parents grew up in the aftermath of the Great Depression. They were taught at a young age never to waste anything. They were drilled on the ideas of “A penny saved is a penny earned” and “Waste not, want not”. Even when times were much more plentiful, they kept the habits that caused them to conserve everything they could. When my father died recently, we found the house full of extra things that had been hoarded ‘just in case’. Some of my parents’ conservation habits were passed down to me. I have trouble instilling those same tendencies in my children.
Programmers like me, who started their careers when computing resources were scarce and expensive, developed habits that die hard. We will spend hours fine-tuning a function or making some data fit within a small space, even when it appears unnecessary. These habits really pay off when ‘Big Data’ comes calling. Huge amounts of data can break code if efficiency was not a serious design consideration.
My Didgets project was designed expressly with that kind of efficiency in mind. I often ran it on old, slow hardware to ensure it worked as fast as possible even when resources were scarce. I threw huge data sets at it and continually tweaked it until it could perform lightning-fast queries even when there were hundreds of millions of files or database table rows.
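Didgets’ internals are not shown here, but the general principle behind fast queries on huge tables can be sketched in a few lines of Python. The record layout, sizes, and names below are illustrative assumptions on my part; the point is that a one-time index build turns a query that must scan every row into a single hash lookup:

```python
import random
import time

# A hypothetical record set of (id, tag) tuples; the layout and size
# are illustrative only, not how any particular system stores data.
N = 1_000_000
records = [(i, f"tag{i % 1000}") for i in range(N)]

def query_scan(records, target_id):
    """Full scan: touches records one by one until a match is found."""
    for rec in records:
        if rec[0] == target_id:
            return rec
    return None

# One-time index build; afterwards each lookup is a single hash probe.
index = {rec[0]: rec for rec in records}

def query_indexed(index, target_id):
    return index.get(target_id)

targets = [random.randrange(N) for _ in range(100)]

start = time.perf_counter()
for t in targets:
    query_scan(records, t)
print(f"100 full scans:      {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
for t in targets:
    query_indexed(index, t)
print(f"100 indexed lookups: {time.perf_counter() - start:.4f}s")
```

The scan’s cost grows with the table, while the indexed lookup’s cost stays essentially flat; that gap is what separates code that merely works from code that still works at hundreds of millions of rows.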
If a query runs fast even when a table is huge, the software will perform exceptionally well on smaller data sets too. The overall system is much more responsive and can handle many tasks simultaneously when all the software running on it has been designed and built with efficiency in mind. Even if your data needs are modest for now, you will be glad you went with something like Didgets when the data flood inevitably comes your way.