When benchmarking my new data management system, Didgets, against mainstream databases like Postgres, MySQL, and SQLite, I typically run queries against tables with many columns and millions of rows. Many of the queries return a large number of rows, and some of my demo videos show Didgets returning the same results in a fraction of the time the other systems take. On many queries Didgets is 10x faster or more, and it does this without requiring separate indexes to be created on the tables.
But a large percentage of database queries do not return a significant number of rows. Instead of fetching the contact information for every customer who lives in California, for example, a customer service representative is often just looking up customer #123456. A database must be able to process a continuous stream of thousands of such requests. The faster a database can handle these small queries (and the corresponding updates), the better the system will perform, even on cheaper hardware.
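To make the distinction concrete, here is a minimal Python/SQLite sketch of the two query shapes (the customers table and its columns are made up purely for illustration):

```python
import sqlite3

# A made-up customers table, just to illustrate the two query shapes;
# the schema and names here are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, state TEXT)")
con.execute("INSERT INTO customers VALUES (123456, 'Jane Doe', 'CA')")

# Broad analytical query: can return thousands of rows.
all_ca = con.execute("SELECT id, name FROM customers WHERE state = 'CA'").fetchall()

# Point lookup: the customer-service case, returning at most one row.
one = con.execute("SELECT id, name FROM customers WHERE id = ?", (123456,)).fetchone()
```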
I realized that in all my benchmarking, I had neglected to compare my system on queries that return only a single row from a large table. So I loaded the Chicago crime data (7.6 million rows) into both Didgets and SQLite to test it. Using the latest DB Browser for SQLite tool, I recorded how long it took to find a single crime report. For example, only one crime has been reported on the ‘118XX S BELL AV’ block over the past 22 years (it must be a relatively safe place). Without an index on the table, it took SQLite 2142 ms (just over 2 seconds) to return the crime data for that block. Once I created an index on the ‘block’ column, the same query took just 3 ms. Clearly, creating an index in SQLite was a tremendous improvement, even though indexes can be a pain to create and maintain.
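For anyone who wants to reproduce the SQLite side of the test, it looks roughly like this in Python (assuming the crime data has already been imported into a table named crimes with a block column; those names are my shorthand, not necessarily the actual schema):

```python
import sqlite3
import time

con = sqlite3.connect("chicago_crime.db")

def timed(sql, *params):
    start = time.perf_counter()
    rows = con.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{len(rows)} row(s) in {elapsed_ms:.1f} ms")
    return rows

# Full table scan over 7.6 million rows: about 2 seconds in my test.
timed("SELECT * FROM crimes WHERE block = ?", "118XX S BELL AV")

# With an index on the 'block' column, the same lookup dropped to ~3 ms.
con.execute("CREATE INDEX IF NOT EXISTS idx_crimes_block ON crimes(block)")
timed("SELECT * FROM crimes WHERE block = ?", "118XX S BELL AV")
```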
I then loaded the same data into Didgets and ran the same query. It took 5 ms. It was a great disappointment that my system was actually slower, so I began investigating. After spending some time stepping through code in the debugger, I discovered the reason and fixed the problem. I re-ran the test and it now showed 0 ms, which means it took less than a single millisecond. Since my timer was only granular to the millisecond, I had no idea how much less than a millisecond it was. To determine that, I modified the code so that it would run the query 100 times before recording the time. The 100 runs took 31 ms, which means each individual query took just 0.31 ms. So Didgets is roughly 10x faster than indexed SQLite at queries that return a single row from a large table.
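The workaround is the standard trick of amortizing the clock's granularity over many runs. Against the SQLite copy of the data it would look something like this (same illustrative table and column names as above):

```python
import sqlite3
import time

con = sqlite3.connect("chicago_crime.db")

# A query that reads as "0 ms" on a millisecond-granularity clock.
# Running it N times and dividing recovers the per-query cost:
# 31 ms / 100 runs = 0.31 ms per query.
N = 100
start = time.perf_counter()
for _ in range(N):
    con.execute("SELECT * FROM crimes WHERE block = ?", ("118XX S BELL AV",)).fetchall()
total_ms = (time.perf_counter() - start) * 1000
print(f"{total_ms:.0f} ms total, {total_ms / N:.2f} ms per query")
```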
Now there is some question about whether the DB Browser for SQLite tool measures query time the same way my Didget Browser tool does. Didgets starts the clock when the query is passed to the engine and stops it when the result set, with all the column values, is returned. I don’t count the time it takes to display those values in the window for the user to see. Displaying a single row should take only a fraction of a millisecond, so I doubt it would skew the results much, but it could affect the comparison slightly if the SQLite tool counts display time.
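In rough Python terms, the boundary I measure looks like this (the render step is a hypothetical stand-in for whatever a browser tool does to paint the window):

```python
import time

def timed_query(con, sql, params=()):
    # The clock starts when the query is handed to the engine...
    start = time.perf_counter()
    rows = con.execute(sql, params).fetchall()
    # ...and stops once the full result set, with every column value,
    # has been materialized.
    elapsed_ms = (time.perf_counter() - start) * 1000
    return rows, elapsed_ms

# The display step happens outside the timed region:
#   rows, ms = timed_query(con, "SELECT * FROM crimes WHERE block = ?", ("118XX S BELL AV",))
#   render(rows)  # hypothetical UI call, not included in 'ms'
```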
I am now very confident that Didgets can greatly outperform any database for queries that return a single row, as well as for queries that return thousands or even millions of rows. I would think its ability to do that without requiring a database administrator to guess which columns in a table are likely to be queried and create indexes on each of them would be a major selling point. People will tell you all the time that speed is extremely important, but only time will tell if that is really true. So far, my videos have been seen by a few thousand people, but very few are knocking down my door trying to figure out why Didgets is so fast.