Multi-faceted Systems

Where do you focus limited resources?

Oct 05, 2022

Many inventions end up being used for things that their inventors never imagined. The inventors surely dreamed of at least a few ways in which their creation could be put to use; but some of the best inventions get used in ways their inventors just could not anticipate. This is especially true for things that have a large variety of uses.

Take smart phones for example, which are now ubiquitous and have many uses. A smart phone is a mobile telephone. It is an instant messaging device. It is a digital camera. It is a video recorder. It is a navigation device. It is a streaming music and video player. It is a portable computer capable of running thousands of mobile applications. It is an internet browsing device.

If you asked a thousand users to rank which features are the most important, or which feature they would give up if one had to be eliminated, you would get a wide variety of answers. Certainly some of the typical smart phone’s capabilities were envisioned by its inventor(s); but as with personal computers and the Internet, no one could have predicted their full potential when the first ones were introduced. Developers are constantly finding new ways to use them, so in the not-so-distant future they could be used quite differently than they are now.

Because so many of the smart phone’s features are inter-related, the overall value of one of these devices is much greater than just the sum of its parts. The video camera is used by conference applications to conduct on-line meetings. The GPS coordinates are used by much more than just the navigation applications. We never know which innovation will become tomorrow’s indispensable feature.

Now imagine that you were the inventor of the smart phone and wanted to introduce it into a market where phones, cameras, and video recorders already existed. Imagine further that instead of being a huge company like Samsung or Apple, you had a small team with limited capital. You managed to get several of its features minimally working, but all of them still needed substantial work to get them to today’s level and some features were still on the drawing board. Given your limited resources, which features would you focus on to try to gain some traction in the market? How would you market it to an audience that had trouble seeing how it was so different from their existing alternatives?

This is the dilemma that I currently have with my Didgets project, which is a new kind of general-purpose data management system. It has some things in common with other data managers like file systems, relational databases, and NoSQL solutions; but its design and implementation are vastly different from each of them. It can already do a number of tasks well, and its speed and ease of use for each of those tasks are superior to those of other systems; but almost none of its features are yet fully implemented and thoroughly tested in extreme data settings.

It is difficult to know which area to focus attention on in order to gain some significant market traction. Like the smart phone, Didgets can appear to be quite different depending on which area you look at. One day it is a simple data analytics platform. The next it is an indexing service. Still later it is a content management system. This can be confusing to anyone wanting to know what it is and what it does.

I will first try to describe the system by giving its grand vision of what I think it is capable of becoming once all features are working as designed. I will then try to break down each significant feature set and discuss the problem it is trying to address. Hopefully, this will give the reader a good understanding of what it can do now and where it could be going.

The Grand Vision

The system is designed to be a global distributed data management system called the Didget Realm. Just like the World Wide Web, this network will consist of millions or even billions of individual ‘nodes’ that can communicate with each other to share and manage all kinds of data. Unstructured data, highly structured data, and semi-structured data can all be managed by a single coherent system.

Each node consists of a logical container (called a Pod) that holds all the data objects (called Didgets - short for Data Widgets); the control software (Didget Manager); and an API for creating and manipulating Didgets. Each node is capable of handling many millions of individual Didgets and can quickly find groups of them based on their types or the meta-data tags they have attached. The software is compact and scales very well. This means it can run on simple devices with limited hardware resources and also take advantage of advanced hardware to manage vast amounts of data.
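To make the idea of finding groups of Didgets by type or tag more concrete, here is a minimal in-memory sketch. It is not the Didgets API: the Pod class below, its create and find methods, and the example tag names are all hypothetical, chosen only to illustrate how a container might keep per-type and per-tag lookups so that a query becomes a set intersection.

```python
from collections import defaultdict


class Pod:
    """Toy container. Every name here is hypothetical, not the real Didgets API."""

    def __init__(self):
        self._next_id = 1
        self._types = {}                  # didget id -> type name
        self._by_type = defaultdict(set)  # type name -> ids of that type
        self._by_tag = defaultdict(set)   # (tag, value) -> ids carrying that tag

    def create(self, didget_type, **tags):
        """Add a new object of the given type with optional tags; return its id."""
        didget_id = self._next_id
        self._next_id += 1
        self._types[didget_id] = didget_type
        self._by_type[didget_type].add(didget_id)
        for tag, value in tags.items():
            self._by_tag[(tag, value)].add(didget_id)
        return didget_id

    def find(self, didget_type=None, **tags):
        """Return sorted ids matching the type (if given) and every tag given."""
        candidates = []
        if didget_type is not None:
            candidates.append(self._by_type.get(didget_type, set()))
        for tag, value in tags.items():
            candidates.append(self._by_tag.get((tag, value), set()))
        if not candidates:
            return sorted(self._types)  # no filters: everything in the pod
        return sorted(set.intersection(*candidates))


pod = Pod()
photo = pod.create("photo", place="Paris", year=2021)
report = pod.create("document", year=2021)
print(pod.find(year=2021))           # both objects share the year tag
print(pod.find("photo", year=2021))  # only the photo matches type and tag
```

A real system managing millions of objects would need persistent, compact structures rather than Python sets, but the shape of the query (narrow by type, then intersect with each tag) is the part this sketch is meant to show.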

Within each node, there are a variety of different Didget types, each designed to manage a specific kind of data. Like a set of Legos or an erector set, various Didgets can be combined to form more complex systems. Hierarchical file systems, relational database tables, indexing services, and logging frameworks are just a few examples of this.

Below is a list of the features that have been at least partially implemented:

Database - Relational tables containing structured or semi-structured data can be quickly created and queried. A table may be small, with just a few columns and a few dozen rows, or it can consist of thousands of columns and hundreds of millions of rows. Each table consists of a set of key-value stores (it is a columnar store) along with the schema and a list of row keys. The schema is flexible, so new columns can be added and existing columns can be converted or deleted without exporting and re-importing the data. Each row/column intersection in the table can have zero or more values assigned. Nulls are free, and because each column can map multiple values to a single key, each table can have a third dimension.
Data Analyzer - The data within each relational table is organized so that analytic operations can be performed in record time without a complex ETL operation. The structure of the tables allows for fast transaction processing while analytic operations are also being performed. Pivot tables against huge relational tables can be created very quickly. Charts and formulas can be used against entire tables or specific subsets. The results of any analytic or query operations can be exported to external files in a variety of formats (CSV, JSON, HTML, XML, etc.).
File Organizer - Each Pod is capable of holding hundreds of millions (or even billions) of Didgets. Queries to find subsets of those Didgets can execute in just a few seconds. As with file systems, Didgets can be organized in a set of hierarchical folders; other groupings, such as photo albums or music playlists, can also be used to organize them. The key-value stores used to create relational database tables are also used to form a tagging system for other Didgets. Every photograph, document, or piece of software can have a set of contextual meta-data tags attached. Finding Didgets based on their tags is a base feature of the system.
Data Cleaner - Relational tables often have data errors. Values that are misspelled, missing, or just wrong can cause query errors and also affect analytics. The Didget system makes it easy to find and fix these errors by making it apparent where the outlier values are and allowing the user to transform or correct many values at once.
File Indexer - Finding files quickly based on their contents is also a problem when large numbers of files are present. The Didget system is able to create indexes based on file contents and help the user quickly find and organize files based on the values they contain.
Logging Framework - The same structures used to create relational tables can be used to store and manage log data. The data can be organized and queried in record time to help find software problems, security breaches or network issues.
Configuration Manager - Managing large numbers of software packages can be problematic. Didgets provides a framework for storing and managing configuration data for thousands or even millions of programs.
Content Manager - Keeping track of content (software packages, music, photos, and other content) and ensuring that it is properly licensed is a daunting task. Didgets makes it easy for content owners to publish and distribute content. Users get a clear mechanism for determining what is available, how much it costs, and what restrictions it might have. Content owners can provide incentives for upgrades and provide fixes or improvements for their audiences.
Duplicate Identifier - Duplicate files within the same container may take up significant storage space. Some large files may be stored hundreds of times, each with different metadata but having the same contents. Didgets makes it easy to find which files have been duplicated and to quickly find all the copies of a certain file.
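The Database entry in the list above describes a table as a set of key-value stores (one per column) plus a list of row keys, where a cell may hold zero or more values. The following is a rough sketch of that layout under simplifying assumptions, not the actual Didgets implementation: the ColumnarTable class and all of its method names are hypothetical, and everything lives in plain Python dictionaries.

```python
class ColumnarTable:
    """Toy columnar table. Hypothetical names; not the Didgets implementation."""

    def __init__(self):
        self.row_keys = []  # list of row keys
        self.columns = {}   # column name -> {row key: [values]}

    def add_row(self):
        """Create a new, empty row and return its key."""
        key = len(self.row_keys)
        self.row_keys.append(key)
        return key

    def add_value(self, row_key, column, value):
        """Attach a value to a cell. The column is created on first use,
        so adding a column never requires rewriting existing rows."""
        store = self.columns.setdefault(column, {})
        store.setdefault(row_key, []).append(value)

    def values(self, row_key, column):
        """Return all values in a cell. A missing entry is an empty list,
        which is why nulls take up no space in this layout."""
        return list(self.columns.get(column, {}).get(row_key, []))

    def drop_column(self, column):
        """Remove a column without touching any other column's store."""
        self.columns.pop(column, None)

    def where(self, column, value):
        """Return the keys of rows whose cell in `column` contains `value`."""
        store = self.columns.get(column, {})
        return [k for k in self.row_keys if value in store.get(k, [])]


table = ColumnarTable()
ann = table.add_row()
bob = table.add_row()
table.add_value(ann, "name", "Ann")
table.add_value(ann, "phone", "555-0100")
table.add_value(ann, "phone", "555-0101")  # second value in the same cell
table.add_value(bob, "name", "Bob")        # Bob has no phone: nothing is stored
print(table.values(ann, "phone"))
print(table.where("phone", "555-0101"))
```

Because each column is its own store, a query on one column only reads that column's entries, and a cell with several values is simply a longer list under the same row key. That list is the "third dimension" mentioned in the Database entry.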

Each of these features can be a compelling reason on its own to use the Didget system. But as with a smart phone, having a variety of features that work well together can change everything. The software is currently in beta and is available for free download from the Didgets website.

Lefty

Oct 5, 2022

“It is a portable computer capable of running thousands of mobile applications.”

It’s a dumb terminal mostly useless unless connected to a mainframe (cloud).

It runs important software apps like putting fruit hats on pets.

Most importantly it is a surveillance tool, and a psychological warfare tool that points an AI directly at your brain to change your thinking and behavior.
