High Performance SQLite is an excellent and immersive course, featuring real-time typed commands and valuable edge case explanations. I highly recommend it for enhancing your SQLite skills.Han-Hsun Liu
Build with and integrate SQLite into your production applications.
I've made my test data available for you to use and follow along.
Alright. In this module, we're gonna look a little bit at the internals of SQLite. Don't worry, we're not gonna go too deep and we're not just doing it because it's fun, although it is fun. We're doing it because this base level knowledge is going to serve as our foundation as we move forward and help us, talk about some of the higher level things. It's really tempting to just learn the higher level stuff without taking the time to learn just a little bit of what's going on under the hood.
But I think our higher learning, our higher level learning is going to be more effective if we understand a little bit of what's going on under the hood. So in this video we're gonna talk about the structure of SQLite, the library, then we'll look at the file format and some extensions. We'll look at some other stuff. The SQLite library has a very well defined structure. It starts on the very outside with the interface and then below that is this SQL command processor.
And one of the things that is most interesting about SQLite is they take your they take your SQL query and then they turn it into bytecode. So they have the tokenizer, the parser, and the code generator. All that works together to turn it into bytecode that is then executed by a virtual machine inside of SQLite. The entire virtual machine is one C file and it takes that byte and executes those instructions. Over here in the code generator lives a super important part of any database and that is the query planner.
So it's made up in in SQLite of a couple of different parts, but generally speaking, this is where the query planning goes on. And a query planner is necessary because when you issue a SQL statement to a database, you're telling it what you're looking for. You're not telling it how to find what you're looking for. So it's the database's responsibility to say, alright, well, there might be, 100 of thousands of ways to find the data that they're looking for. There might be millions of ways to find the data that they're looking for.
Which one is best? What's the best way? I'm gonna plan what I think is the best way and that's what this, query planner does over in the code generator area. All of the data in your SQLite database is stored as a b tree, which is, a data structure that we'll cover later, but your tables are a btree, your indexes are btrees, there are separate b trees for every table and every index. So this btree module figures out which data it needs and then it makes a request to the pager module and the pager module is what actually pulls the data out in fixed size chunks called pages, which is why they call it the pager module.
So the pager module will grab these chunks off of the disc and hand them back to the b tree module. Now the pager module is super important because it handles all the messy details of, writing data to the pages. And so it handles the atomic commits, it handles the rollbacks, it handles the locking. It handles all of the messy details. Below that is the final part and that is the OS interface.
The OS interface is responsible for, I think you can guess it, talking to the actual operating system. So SQLite has this concept, this abstract concept of a vfs and it's responsible for opening and reading and writing and closing the actual files. Currently there are 2 implementations, one for UNIX systems and one for Windows systems. And so that is how SQLite, maintains that cross platform compatibility, is they have separate VFS's for each system and you could write your own if you needed to. So that is the end of that is the end of the SQLite library itself all the way down to the operating system.
Depending on which system you're using is which vfs you'll be using, but that is responsible for that last part that could be different based on what system that you're on. The most important concepts from this, while it is all interesting, the most important concepts that will continue to come back up regardless of what database you're using, actually. So this is this is generally applicable knowledge, btrees and pages. Those are gonna continue to come up and continue to be very, very important. And so throughout the rest of the course, we're gonna dive deeper into those things.
So don't feel don't feel like you're behind if you don't know what those are. That's fine. We needed to introduce the concept at some point and now we have and now we can move on.