Aaron has this fantastic way of breaking down complex topics to bits that are understandable. I finally understand how to tune my PRAGMA values instead of just copy/pasting from StackOverflow.Mathias Hansen
Build with and integrate SQLite into your production applications.
I've made my test data available for you to use and follow along.
One more thing that fits nicely into this indexing module is foreign keys. Now, importantly, foreign keys are different than foreign key constraints. So you have foreign keys, which is just a way of referencing other rows. A child can say, hey. I've got a parent in that table.
Or a parent could say, hey. I've got a child in that table. That's just the way that you reference other rows by passing IDs around. Those are foreign keys. Right?
Foreign key constraints is how you allow the database to enforce referential integrity. So not only can you have rows that reference other rows, you can say to the database that this row must reference a legitimate row in a different table and if not, I want you to do something. Throw an error, do something. That is a foreign key constraint which by default are disabled in SQLite, but there's an easy pragma to turn them on if that is what you want. Now, do you need foreign key constraints?
I don't know. I don't know, man. It depends on your application. I like to have the database enforce that referential integrity. That helps me feel good.
It helps me sleep well at night. Once you reach a certain scale, it might be too expensive to enforce referential integrity. That is true with other databases. I think, if I had to guess, that would be a hard scale to reach with SQLite. SQLite is not meant for multi data center, many, many hundreds of thousands of writes at a time.
And so I don't know if you can reach that scale with SQLite. May maybe you can. I haven't personally reached that scale and so I can't really speak to it. So I like enabling foreign key constraints. Reasonable minds differ on it.
But I'm gonna show you how to do it in SQLite and then I'll let you make that decision. I'm gonna go ahead and create a table called parent and we'll say that ID is an integer primary key. That's probably enough for that table. Create table child. Id is an integer.
Primary key. And parent id is an integer that ref integer that references parent ID. So this will set us up with, this will set us up with the foreign key constraint by saying, hey, Whenever you insert something into this column, I need you to go to the parent table and check the ID column to make sure that something exists over there with that given ID. So if we create that and then we check indexes, we see email and is pro. Those are the old ones.
So we've created the, the reference but we we've created the constraint but we haven't added any indexes yet. Some indexes are required and some are suggested when it comes to foreign keys. The required index is that the parent ID must be unique. Because you you can imagine if you have a child table that references a parent ID, there can't be there can't be more than one parent ID. Now, normally, the parent ID is an alias for the row ID or the primary key.
So in some regard or in some way, that, that ID is usually unique by its very nature. But you could have, you could have it reference a column that is not the primary key and that would have to be subject to a unique constraint. The index that is suggested is on the child side. It's good to have an index on parent ID. It makes everything a lot faster.
So if we were to insert into parent, we'll do values of 1 and 2 and 3. So we're just gonna insert 3 rows into there. Interestingly, it already scans the child. That's interesting. Select star from parent and we see scan the parent.
There we go. So we have values of 1, 2, 3. Insert into child, we're just gonna insert the parent ID with a value of 1 and 2 and 3. So far nothing interesting is happening which is what we want. So we've got 3 children that reference 3 parents.
Not that interesting. If we were to try this again and insert a 4, here's what's gonna happen. It says, whoops, I went over to the parent table and nobody knew you. Nobody knew you over there. So it tried to insert 4, it went the parent table and said, there is no 4.
You told me that we should have constraints on and so therefore I am going to fail. Now, if we were to run explain and then we can just copy, this is where we get into the realm of stuff that is really hard to read. So we're gonna insert a valid value into the child table with explain on. And holy moly. Oh my goodness.
Okay. Let's see if we can just figure out a little bit of what's going on here. 1st, it looks like it's opening for right. Who knows what all of that means? Opening the child table.
It does a new row ID. It makes a record. Something something something. Then look down here. Open for read the parent table.
So as the, as the child row is being inserted, it goes over to the parent row and looks for the integer key. I actually don't know what the integer key is, but it goes and looks over for the integer key and then if it's not found, it's going to halt. Otherwise, it's going to go to 14 child table. So this is a little bit over my head, frankly. But as it's inserting into the child table, it looks into the parent table and looks for that ID to make sure that it can insert that row into the child table.
Let's look at the indexes required on the child table really quickly. If we were to do select star from parent, left join child on parent dot id equals child dot parent id, That is going to work. You'll see again it's using that automatic index which is a sign for us that we have missed something. So if we turn expert on and then run this again, it says, create index on child parent ID. Yeah.
That makes enough sense. In fact, this is ready to go. We can just copy and paste that and if we run this again, you'll see great. Now it's using the covering index, the the one that it previously created automatically. Now now we have created it and so it persists and so this is going to be faster.
So it's using the covering index child idx yada yada yada. And that satisfies the needs of the query which means it's a covering index and that makes the join a lot faster because it doesn't have to scan the table multiple times which it won't do. Instead, it will construct a temporary b tree to prevent it from scanning the table multiple times. So, foreign keys, always on, impossible to disable foreign key constraints, must be turned on, and then you add that references when you're constructing the table. For your foreign key constraints, your parent key must be unique.
You can't have a child referencing a parent and there be multiple possible parents over there. Your parent ID in the parent table must be subject to a unique constraint. You can have a composite parent key, that's fine, but the parent key must be unique even if it is composite. For the child table, you likely want to put an index on that parent ID in the child table because that is what SQLite is gonna use to join those two things together. And instead of scanning the table multiple times, it's just gonna do the work of creating that index for you, so you might as well go ahead and do it so that it's persisted and not created at run time all the time.
We are going to go into more depth about joins in the querying section, but I just wanted to hit it very quickly because we are in the indexing section and this is pretty important for indexing to make sure that your joins perform the way that you're hoping.