High Performance
SQLite
Introduction
Introduction to this course
Introduction to SQLite
SQLite is safe
It's just a file
Running SQLite locally
Good uses for SQLite
SQLite's limitations
The documentation
SQLite Internals
SQLite's structure
SQLite's file format
Dot commands
Pragmas
Virtual tables
Schema
Flexible types
Types
Strict types
Dates
Booleans
Floating point
Rowid tables
Auto increments
Without rowid
Generated columns
Optimizing SQLite
Locking
Rollback mode
WAL mode
WAL vs Journal benchmarks
Busy timeout
Transaction modes
Vacuum
Analyze & Optimize
Suggested pragmas
Faster inserts
Indexes
Introduction to indexes
B+ trees
Primary, secondary, and clustered indexes
Without rowid tables
Benchmarking without rowid tables
Imposter tables
Primary key data types
Where to add indexes
Index selectivity
Composite indexes
Composite ordering
Covering indexes
Partial indexes
Indexes on expressions
Automatic indexes
Duplicate indexes
Indexing joins
Advanced SQL
Explain
Index obfuscation
Joins
Subqueries
Unions
CTEs
Recursive CTEs
Window functions
Dealing with NULLs
Row value syntax
Indexed sorting
Upserts
Returning
Aggregates
Triggers
Operating SQLite
Altering schema
Altering schema with tools
Multi-database
Multi-tenancy
Backups
Exports
How to corrupt SQLite
Advanced JSON
Intro to JSON
JSON vs. JSONB
Argument types: path, value, JSON
JSON5
Valid JSON
Creating JSON objects + arrays
JSON Extraction
Updating JSON
JSON aggregates
JSON table functions
Indexing JSON
Full Text Search
Creating FTS tables
Performing searches
Ranking results
Highlighting matches
Bonus Interviews
DHH discusses SQLite in Ruby on Rails
Distributed SQLite with Litestream and LiteFS
Offline-first, multiplayer SQLite
Production SQLite with Turso and libSQL
Migrating from Postgres to SQLite with Kent C. Dodds
Ruby on Rails with SQLite
Moving from Redis to SQLite with Mike Buckbee
Locked video

Please purchase the course to watch this video.

Video thumbnail
Full Text Search
Ranking results

Full Course

$
179
$229
USD, one-time fee
Thank you for the excellent High Performance SQLite course! It’s been incredibly useful in deepening my understanding of SQLite. The sections on SQLite internals, schema, and optimization techniques are particularly insightful and have significantly improved my ability to work with SQLite in production environments.
Ali Kasmani
Ali Kasmani

SQLite for Production

Build with and integrate SQLite into your production applications.

Test Data

I've made my test data available for you to use and follow along.

Download
or
Use on Turso

Summary

In the video, I show you how to improve search results by ranking them for relevancy in two ways. First, I explain how to use a rank column from the full text search table to order results, which works well for basic searches. Then, I introduce the bm25 function for assigning different weights to columns, helping you fine-tune searches by prioritizing certain fields like the title over others like transcripts.

Video Transcript

(keyboard keys clacking) So far we have created, populated, kept up to date and searched a full text search table. But there are a few more things I want to show you, because I want you to be able to implement this in your applications. I want to teach you practical things. So, if we were to run this as we have been running it, we would be returning results to the users that are correct, which is always a good start. We want to be correct first, but it's not entirely relevant yet. So I'm gonna show you how you can rank the results by relevancy in two different ways. We're back to this query here, which we join in the videos table 'cause this contains the actual data that we're looking for and we're pretending that the user has typed on the website indexes. And so if we were to run that, you see, I mean, yeah, but look at this. SQLite's structure comes first and composite indexes comes ninth. That's not very good. So what we can do is let's go ahead and just add, let's just select title to narrow it down a little bit. So we're gonna select videos.title, and then there's this secret other thing, the secret other column that contains the rank. So if we look at the rank, you see it's this big floating point number and the smaller the number is, the more relevant it is to the user's search. So you can see that this negative 1.0 is at the top, but in fact negative 1.6, that one that we looked at earlier is frankly a much better match. So we can simply say order by rank. And this is provided by the full text search table. We didn't create it, it's just there. So now if we do that again and we run it, boy, that if I'm a user and I searched indexes, introduction to indexes is in fact what I would be looking for. This is very good. This is a great start. And if you stopped here, I would be happy. I'd be totally fine with that. In this case, however, there are cases where this might not be exactly what you're looking for. So imagine you've got 10 columns. So instead of just title and transcript, you've got title, transcript, comments, slug, maybe you're searching across the URL slug as well. And you want to like, you want to give different weights to different columns because the word index showing up in a title means a lot more than the word index showing up in someone's comment, right? So we have another option here. We have a a function. So you have to do open, we have a function here called bm25. And if you run that, it's not gonna work. And so what you have to do is you have to pass in the full text search table. And now we've got BM 25 and it didn't do anything. Look, they're all identical. They're all the exact same. So that's not entirely helpful. Where it gets helpful is you can pass through different weights. So remember when we declared this full text search table, we did, we did like create virtual FTS over title transcript using FTS, whatever, right? We did something like this. The important part is right here, title, transcript. So when you're using the bm25 function, you can follow this order and give different weights to different columns. In this exact example, it is more valuable if the word shows up in the title than if it shows up in the transcript. And so here we're saying, hey, we are, we wanna assign 10X the weight to the title versus 1X to the transcript. Now, we could do, if we had multiple columns, we could say five for the transcript and one which is the default for any column after that, ten five when we only have two columns. Doesn't make a lot of sense because you're saying this is worth 10 and this is worth five, which reduces down to this is worth two and this is worth one. So when we only have two columns, let's just say that a title hit is worth five X a transcript hit. And now if we were to run that again, you'll see it does, honestly, it does slightly change a little bit. I don't know if it changes the order a little bit, which is a good indicator that we had it right the first time. If we wanna overweight it and say a title hit is worth 15 x, that is what we would get there. And then of course we're just looking at it. But you could plop that right down there and say rank it by that. Now this looks pretty good to me, introduction to indexes, automatic indexes. So we're getting all of the ones with index in the title. Here's one which must have a bunch of mentions of index in the transcript to beat out these down here. So this is, this is an interesting case and I wonder if, so these up here are a strict match on indexes, so I bet those get more points where this one down here is a singular index. And so I wonder if we changed it to index. Yeah, it changed it. It changed it up quite a bit. So now we have covering, obfuscation, duplicate, composite, and then it looks like altering schema. I don't even see it on here. So that that changed it up. That changed it up quite a bit. So bm25 is a good way to fine tune the ranking algorithm based on your specific data. And so what I'm not telling you is that bm25 is better than rank. I don't know. I don't know your use case. I don't know your data. I am telling you bm25 exists and might be incredibly helpful for your application.