High Performance
SQLite
Introduction
Introduction to this course
Introduction to SQLite
SQLite is safe
It's just a file
Running SQLite locally
Good uses for SQLite
SQLite's limitations
The documentation
SQLite Internals
SQLite's structure
SQLite's file format
Dot commands
Pragmas
Virtual tables
Schema
Flexible types
Types
Strict types
Dates
Booleans
Floating point
Rowid tables
Auto increments
Without rowid
Generated columns
Optimizing SQLite
Locking
Rollback mode
WAL mode
WAL vs Journal benchmarks
Busy timeout
Transaction modes
Vacuum
Analyze & Optimize
Suggested pragmas
Faster inserts
Indexes
Introduction to indexes
B+ trees
Primary, secondary, and clustered indexes
Without rowid tables
Benchmarking without rowid tables
Imposter tables
Primary key data types
Where to add indexes
Index selectivity
Composite indexes
Composite ordering
Covering indexes
Partial indexes
Indexes on expressions
Automatic indexes
Duplicate indexes
Indexing joins
Advanced SQL
Explain
Index obfuscation
Joins
Subqueries
Unions
CTEs
Recursive CTEs
Window functions
Dealing with NULLs
Row value syntax
Indexed sorting
Upserts
Returning
Aggregates
Triggers
Operating SQLite
Altering schema
Altering schema with tools
Multi-database
Multi-tenancy
Backups
Exports
How to corrupt SQLite
Advanced JSON
Intro to JSON
JSON vs. JSONB
Argument types: path, value, JSON
JSON5
Valid JSON
Creating JSON objects + arrays
JSON Extraction
Updating JSON
JSON aggregates
JSON table functions
Indexing JSON
Full Text Search
Creating FTS tables
Performing searches
Ranking results
Highlighting matches
Bonus Interviews
DHH discusses SQLite in Ruby on Rails
Distributed SQLite with Litestream and LiteFS
Offline-first, multiplayer SQLite
Production SQLite with Turso and libSQL
Migrating from Postgres to SQLite with Kent C. Dodds
Ruby on Rails with SQLite
Moving from Redis to SQLite with Mike Buckbee
Locked video

Please purchase the course to watch this video.

Video thumbnail
Indexes
Duplicate indexes

Full Course

$
179
$229
USD, one-time fee
I highly recommend High-Performance SQLite to fellow engineers. My motivation for learning SQLite stemmed from my interest in local-first applications, and I needed to understand SQLite to effectively use it in the browser. This resource provided that knowledge in a clear and practical manner, making it a must-watch for modern developers.
Yuya Fujimoto
Yuya Fujimoto

SQLite for Production

Build with and integrate SQLite into your production applications.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Turso

Summary

In the video, I explain how database indexes work, emphasizing that they read from left to right and stop at the first range condition. I show that multiple indexes sharing the same left prefix can be redundant, and you can often delete the simpler one to save space. However, if an invisible ID affects your queries, you must be careful not to remove the wrong index, as it could slow down data retrieval.

Video Transcript

Hopefully, by now, you have seared into your brain left to right, no skipping, stops at the first range. That is how an index is used. It starts at the left. It moves to the right. It stops when it encounters the first range condition.

This also means 1 other thing in that indexes that share a common left prefix, stay with me, indexes that share a common left prefix are duplicative. Woah. What does any of that mean? Let's look at the code. Let's take a look at what we have so far.

And we've got nada, which is what we're after. And as a reminder, we've got bookmarks, migrations, and users. Migrations doesn't matter. We've got users and bookmarks. Okeydoke.

Let's say we need an index on, an index on email. That seems common enough. Let's do create index email email on user's email. So we do that and then we're we take a look and we're like, great. We've got our index.

Actually, you know what? We do a lot of querying by email and is pro. So if we can remember, select star from users, limit 2 is fine. That box mode is too bad. Is pro is there.

So let's create another index and we'll say create create index email is pro on users email is pro. So we create that index and now we have 2 indexes. Let's pull this guy back up. We have 2 indexes that have email as a leftmost column. They share a left most prefix.

That means that this index right here is a duplicate of this index even though it has an additional column in there. So let's clear that line and we'll say select star from users where email email equals aaron.frantis at example.com. So it says search users using index email. So it's using that other index, which that's fine. We would probably expect that.

But if we were to drop index email and we were to run this again, it just uses email is pro. So it just uses the other index. So the the takeaway here is if you look at your indexes and indexes will accumulate over time, right, as business needs change or as your data schema changes, indexes just kind of pile up. And so go dig through your indexes and look to see, do any of these indexes share a leftmost prefix? So, usually what you'll find is that you have an index on a single column like email and then an index that is a composite index that has email as the left column.

That does you no good. All that does is take up space, potentially confuse the query planner, Although, the query planner is pretty sophisticated but it is worthless. The only tiny tiny tiny caveat is that the index on email is actually an index on email row ID because remember every secondary index contains a pointer to get back to the main table. And in our case, id is an alias of row id. So when we create an index on email, actually we're creating it on email comma ID.

And when we create it on email is pro, we're actually creating it on email is pro ID. Why would this ever matter? Left to right, no skipping stops at the first range, so you can imagine that this might matter. This is the the outside outside use case where, 2 indexes are not duplicative of each other. So I'm gonna error that out and we'll say select star from users where email equals [email protected].

That's gonna use our email is pro index just fine. But if we threw an order by ID descending on there, now is pro is in the way, right? So is pro is blocking our access to this invisible ID that is automatically added at the end. So if we ran that, you'll see use tempbtree for order by. So it used the first part of the index, got to the second part, and said, ah shoot, that is not in my query so I can't use it, which means I don't have access to the third part of the index for sorting.

So let's do, create index on users or create index email on users email. Let's create that 1 again. And now if we were to do the same thing where email equals whatever order by id, we'll see it's using that other key, that other index email and the sort by temp b tree thing, that has gone away. Because it was able it was able to use the first part of that index for the strict equality lookup and the second part of that index for ordering. So what does this what does this all mean?

This means that any 2 indexes that share a left most prefix, so it has the same columns in the leftmost side of the index, those are duplicates of each other, and you can delete the 1 with fewer columns because the 1 with more columns in the index encompasses that 1 with fewer columns. The only caveat is if you are relying on that secret ID or row ID being there, then you probably can't delete that duplicate index. Most people are not relying most people don't even know that that second like that secret ID is in there and can be used. And so most people aren't relying on it. But if you are relying on it and you delete it, then your ordering may become quite a bit slower and I don't want you blaming me for that.