High Performance
SQLite
Introduction
Introduction to this course
Introduction to SQLite
SQLite is safe
It's just a file
Running SQLite locally
Good uses for SQLite
SQLite's limitations
The documentation
SQLite Internals
SQLite's structure
SQLite's file format
Dot commands
Pragmas
Virtual tables
Schema
Flexible types
Types
Strict types
Dates
Booleans
Floating point
Rowid tables
Auto increments
Without rowid
Generated columns
Optimizing SQLite
Locking
Rollback mode
WAL mode
WAL vs Journal benchmarks
Busy timeout
Transaction modes
Vacuum
Analyze & Optimize
Suggested pragmas
Faster inserts
Indexes
Introduction to indexes
B+ trees
Primary, secondary, and clustered indexes
Without rowid tables
Benchmarking without rowid tables
Imposter tables
Primary key data types
Where to add indexes
Index selectivity
Composite indexes
Composite ordering
Covering indexes
Partial indexes
Indexes on expressions
Automatic indexes
Duplicate indexes
Indexing joins
Advanced SQL
Explain
Index obfuscation
Joins
Subqueries
Unions
CTEs
Recursive CTEs
Window functions
Dealing with NULLs
Row value syntax
Indexed sorting
Upserts
Returning
Aggregates
Triggers
Operating SQLite
Altering schema
Altering schema with tools
Multi-database
Multi-tenancy
Backups
Exports
How to corrupt SQLite
Advanced JSON
Intro to JSON
JSON vs. JSONB
Argument types: path, value, JSON
JSON5
Valid JSON
Creating JSON objects + arrays
JSON Extraction
Updating JSON
JSON aggregates
JSON table functions
Indexing JSON
Full Text Search
Creating FTS tables
Performing searches
Ranking results
Highlighting matches
Bonus Interviews
DHH discusses SQLite in Ruby on Rails
Distributed SQLite with Litestream and LiteFS
Offline-first, multiplayer SQLite
Production SQLite with Turso and libSQL
Migrating from Postgres to SQLite with Kent C. Dodds
Ruby on Rails with SQLite
Moving from Redis to SQLite with Mike Buckbee
Watch for free

Enter your email below to watch this video

Video thumbnail
Optimizing SQLite
Vacuum

Full Course

$
179
$229
USD, one-time fee
The production quality of this course is so good and Aaron‘s enthusiasm and energy is infectious. I’m excited to master this extremely useful and powerful database technology that I had once written off as a toy for small projects. Now I know where SQLite shines and how I can use it effectively.
JD Lien
JD Lien

SQLite for Production

Build with and integrate SQLite into your production applications.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Turso

Summary

I talked about how SQLite databases do not automatically decrease in size when you delete data because the pages are still marked as reusable. To actually reduce the database size, you can use a process called "vacuuming" which compresses the database and eliminates free pages. Vacuuming can be important for saving space on limited storage devices and reducing fragmentation to improve performance.

Video Transcript

Remember back when we talked about the structure of SQLite? It is one file with a bunch of pages inside of it and you can calculate the size of the database based on the number of pages and the page size. So what happens if you delete basically everything in the database? Does the file size go down? Probably not.

Let's take a look. So in here we've got a bunch of databases that I have been working with. Let's do let's copy database dot SQLite and name it, we'll call it vacuum dot SQLite. A little hint about where we're going. So SQLite 3 vacuum SQLite.

So now we're in this vacuum database. We've got parent, or we've got bookmarks, child, migrations, parent, users. Just some tables that have been filling up. Again, you can find this data down below. You can spin it up on Terso pretty easily, but we'll just keep going here locally.

So if we were to run let's do pragma page count. This tells us how many pages are in the database and pragma page size will tell us the size of those pages. So if we were to do this number times 4096, we get 740,000. So if we do here, vac 740,000. So that we're calculating the size of our database here.

Let's get back into the database and do one other thing. We can do pragma free list, I think, free list count. And this tells us how many pages are on the free list. So when you delete a bunch of data, it doesn't actually all go away unless you use secure delete which is a thing. It doesn't actually all go away.

It just gets marked as, you can reuse this page later. This data, you can override it. It doesn't matter. So if we were to do select count star from users, There are a 1000000 users in there and if we look at bookmarks, there are 5,000,000 bookmarks in there. Which one do you wanna delete?

Let's, let's delete from bookmarks. So what could go wrong? Here we go. So select count from bookmarks. Wow.

That was really fast. And now if we check the free list count, you'll see, woah. We got a whole bunch of free pages there which means those pages are now marked as reusable. And if we exit, we'll see that the size of the database has not changed one bit. Literally literally one bit.

It hasn't changed at all. That is because those pages still exist. If we were to crack open this database and look, we would see all of that data is still there. And there are a few ways to get rid of it. Based on the name of the database, what do you think we're gonna do?

We are going to vacuum it up. So if we hop back in here to vacuum and clear this out, you can issue this command, vacuum. That's it. So this is going to take the database. It actually makes an entire copy of the database, and compresses it down so that all of those free pages are gone.

So if we were to check the free list count, we're down to 0 free pages. And if we were to exit, we would see that the size of the database has been reduced dramatically because all those pages all the populated pages were moved together so it reduced fragmentation and then the free list was moved to the end and then cut off. And so the database got a lot smaller. Now there are two potential reasons you might want to manually vacuum your database. The first is the one we just looked at which is size.

Now, in the world in which I live, which is the web application world, disk space is not really that big of a concern to me. So I'm never probably gonna vacuum just to reclaim a little bit of disk space. That doesn't concern me that much. You may live in a very different world. You may live in mobile devices or embedded devices or anything like that where disk space kind of is at a premium.

Or maybe you're working somehow in the browser and you need to you need to compress that database back down after a big delete. That would be a great use case for vacuuming. So it kinda depends on where you live and where your application lives. But for me, super not that important. The other reason you might vacuum is to reduce fragmentation.

So over time, as you're inserting, updating and deleting, you might have rows being put into those free list pages and so eventually the rows are kind of fragmented. And everything is gonna work faster if the data is all right next to other relevant data. And so there's unfortunately no way to, like, get an absolute read on this. When it comes to compressing it for size, you can compare the free list count to the page count and after it reaches a certain threshold you might say, goodness, 30% of my database is empty. I'm gonna go ahead and compress that.

That's great for the size argument, but for the fragmentation argument, there's there's no such stat that you can run. So you kinda just have to watch your queries. And if they start to get slow over time, you might consider looking into vacuuming. The problem with vacuuming is it can be slow and it requires an exclusive lock to, commit all of that data back into the database. It can also, in fact it does, require more space because it's making an entire copy of that database.

And so for a period of time, you're gonna have a lot more used space before it's all written, the compressed version is all written back in. So the two reasons would be make the database smaller and reduce fragmentation. It's kinda hand wavy when you wanna reduce fragmentation. There are auto vacuum modes. You can have a full auto vacuum or an incremental auto vacuum.

By default, it's off and I think that's probably just fine. I think it's probably just fine. I don't want to incur the penalty of continually vacuuming, every time a transaction or a connection is closed out. Then it's gotta acquire that lock. It's gotta vacuum.

It's gotta do all of that stuff. And that just I'm just not moved by that very much. But I do want you to know that this is something you can look out for. If you are doing a lot of maybe potentially thrashing to the table, inserting and deleting a lot of stuff, it might be a good idea to consider vacuuming.