High Performance SQLite
Introduction
Introduction to this course
Introduction to SQLite
SQLite is safe
It's just a file
Running SQLite locally
Good uses for SQLite
SQLite's limitations
The documentation
SQLite Internals
SQLite's structure
SQLite's file format
Dot commands
Pragmas
Virtual tables
Schema
Flexible types
Types
Strict types
Dates
Booleans
Floating point
Rowid tables
Auto increments
Without rowid
Generated columns
Optimizing SQLite
Locking
Rollback mode
WAL mode
WAL vs Journal benchmarks
Busy timeout
Transaction modes
Vacuum
Analyze & Optimize
Suggested pragmas
Faster inserts
Indexes
Introduction to indexes
B+ trees
Primary, secondary, and clustered indexes
Without rowid tables
Benchmarking without rowid tables
Imposter tables
Primary key data types
Where to add indexes
Index selectivity
Composite indexes
Composite ordering
Covering indexes
Partial indexes
Indexes on expressions
Automatic indexes
Duplicate indexes
Indexing joins
Advanced SQL
Explain
Index obfuscation
Joins
Subqueries
Unions
CTEs
Recursive CTEs
Window functions
Dealing with NULLs
Row value syntax
Indexed sorting
Upserts
Returning
Aggregates
Triggers
Operating SQLite
Altering schema
Altering schema with tools
Multi-database
Multi-tenancy
Backups
Exports
How to corrupt SQLite
Advanced JSON
Intro to JSON
JSON vs. JSONB
Argument types: path, value, JSON
JSON5
Valid JSON
Creating JSON objects + arrays
JSON Extraction
Updating JSON
JSON aggregates
JSON table functions
Indexing JSON
Full Text Search
Creating FTS tables
Performing searches
Ranking results
Highlighting matches
Bonus Interviews
DHH discusses SQLite in Ruby on Rails
Distributed SQLite with Litestream and LiteFS
Offline-first, multiplayer SQLite
Production SQLite with Turso and libSQL
Migrating from Postgres to SQLite with Kent C. Dodds
Ruby on Rails with SQLite
Moving from Redis to SQLite with Mike Buckbee
Indexes
Benchmarking without rowid tables



Test Data

I've made my test data available for you to use and follow along.


Summary

In the video, I compared two key-value tables, one with a rowid and one without, to see whether there's a performance difference between them. I found almost no performance difference on my machine, although the WITHOUT ROWID table is slightly smaller on disk. My advice is not to focus on rowid optimizations early in development, and instead to prioritize indexes for performance improvements.

Video Transcript

We're gonna benchmark two different key-value tables. One has a rowid and one is WITHOUT ROWID. Otherwise, these tables are identical. This benchmark is helpful for comparing these two things to each other, but like every benchmark, it's fake. It's fabricated.

I tried to make it as real as possible, but this is as good as we can get with an artificial benchmark. So benchmarks are all wrong, but some of them are useful, and I hope this one is useful. Let's take a look at the methodology. Don't worry about any of this stuff; this is all just Laravel's database setup here.

So we've got two different tables, kv1 and kv2. We're gonna set WAL mode on for both of those. And then these statements here create the tables, and they're pretty simple: create table kv with key as a text primary key and value as an any. Both tables are strict, but you'll see that kv2 is WITHOUT ROWID and kv1 is not without rowid, which means it is with rowid. Let's take a look at the benchmarking code.

I tried to make this as close to production as possible, but it's not actually production. So here's what we're gonna do. This is the bench command. We can pass in the number of processes. We'll probably keep that at 25.

We'll probably bump up the number of queries to 50,000 instead of 15,000. And then, let's hop down to the child process down here. The first thing we do is figure out if we're gonna do a read or a write. We're only gonna write 5% of the time, and we're gonna read 95% of the time. That seems like a pretty good mix to me.

The key is 8 random characters. The value is 64 random characters. Now, I have seeded these random functions, so we get reproducible random values. They're still fully random; they're just gonna be the same every time we run them.

What that allows us to do is test with the kv1 table and then test with the kv2 table, and the values being used for both tests are gonna be the same, but I don't have to generate a million random strings up front myself. All I did was seed the random number generator. Then the only thing I wanna measure are these statements. If it's a write, we're gonna do this statement.
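The seeded-random-workload idea can be sketched like this. The 5%/95% split and the key/value lengths follow the video; the function name and character set are mine:

```python
import random
import string

def make_workload(seed: int, n: int):
    """Yield a reproducible stream of (is_write, key, value) operations.

    Seeding the generator means every benchmark run -- whether it targets
    kv1 or kv2 -- sees exactly the same "random" keys and values, without
    pre-generating a million strings up front.
    """
    rng = random.Random(seed)
    chars = string.ascii_lowercase + string.digits
    for _ in range(n):
        is_write = rng.random() < 0.05             # write 5% of the time
        key = "".join(rng.choices(chars, k=8))     # 8 random characters
        value = "".join(rng.choices(chars, k=64))  # 64 random characters
        yield is_write, key, value

# Two runs with the same seed produce identical workloads:
run_a = list(make_workload(seed=42, n=1000))
run_b = list(make_workload(seed=42, n=1000))
assert run_a == run_b
```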

And if it's a read, we're gonna do this statement. I don't wanna accidentally measure this overhead here. Not that it would be that much, but I don't wanna do it. So for a write, here's what we're gonna do. Let's see if I can get this on one line. There we go.

Insert into kv, key and value. We're gonna bind them in; there they are. On conflict key, do update set value equals excluded dot value. So this is saying: if we're inserting a key that already exists, just update the value instead.

This is an upsert. Pretty cool that SQLite does this; I do like that. If it's a read: select star from kv where key equals key. That's it.

And then we put out W for write or R for read, write all the timings to a JSON file, and calculate the percentiles, etcetera. So that is the methodology. Hopefully it's close enough to the real world to be useful, but it's at least close enough to each other for a relative comparison. So let's do some comparisons. I've got them up side by side here, but the first one we're gonna do is the kv1 database, and we're gonna run 50,000 queries per process, and I think we're gonna go ahead and run 25 processes, which is the default.

So, let's run this and see what happens. It's about to start, and we're off to the races. You can see each one is trying to get up to 50,000, and there we go. So, the DB was kv1, and as a reminder, 1 is rowid, 2 is no rowid. So this is the rowid table.

25 processes, 50,000 queries each. So what does that come out to? A million and a quarter queries. And here's the p50 through p99.

Honestly, I mean, a rowid table is plenty fast. The p99 is at 0.89 milliseconds. So let's run this again. We'll say bench, db equals kv2, and queries equals 50,000. So this is the one with no rowid.

And so, I think what we're about to discover is that it doesn't make a difference on this machine. This is running on an M1 MacBook. So you'll see here, if we look at the p90 or the p95, it's slightly faster. The p90 is slightly faster. So the reads are slightly faster, except for the p99.

We got one errant read out there. And, weirdly, the writes are slightly faster. And this is so slight that, to me, it's noise. It's not real. This is too close to be real.

I mean, we're talking about less than one millisecond of difference. It's just not real. There's no difference. And so it's interesting here that the WITHOUT ROWID table is not that much faster. One thing we can see, if we change into the database directory and list it out: there we go.

kv2 is smaller than kv1. So the table that has no rowid is, in fact, smaller; that database is. So let's open this up. I don't remember the commands to make it pretty, so we're gonna look at 3 megabytes versus 3.4 megabytes. The table that has the rowid is maintaining two b-trees, and the table without the rowid is only maintaining one b-tree, which means there's less data in the second database.

So my suggestion to you is: don't worry about WITHOUT ROWID as an optimization until you're near the end. Once you're near the end of development and you look around, knowing everything you know, you might run some tests on your data to see if there's a WITHOUT ROWID optimization that could be made. But I wouldn't worry about it too much, and it's definitely not the first thing I would reach for. The first thing I would reach for is indexes.