Video Transcript
(keyboard clicking) I'm gonna show you how to search against that full text search table that we just spent all that time and energy creating. I will warn you, there are a lot of ways you can search. There are so many search operators, groups, ways to do column searches, negate columns, bullions, all kinds of stuff. I can't show you everything. I've written down maybe 15 or 20 examples here, and we're just gonna kind of blow through them. If you find that this doesn't meet your search needs, I would first go check out the docs and see if there's a way to meet your search needs before you pull for something else. So, I'm gonna show you just a super basic one that pulls back all the data, and then we'll look at some specific stuff. So, here I have a basic select, we'll walk through it from the very top. Select videos.* from videos_fts, JOIN videos. So, this is what it looks like to pull back the content table using the full text search table. So, the data that we want is actually in videos. That contains the slug, the, in production, it contains the video URL, it contains whether it's free or paid. It contains all that information. And our video_fts table only contains title and transcript. So, we're gonna join them together, which is why we declared what that rowid was, such that they can be joined together. Okay, so we only wanna bring back videos.*, but we're gonna start with videos_fts, and then join in on the rowid equals the ID. And then we're just gonna say where the video_fts matches the word "indexes." Now, this matches across every single column in that FTS table. You can add a column to the FTS table that is unindexed. For a content table like this, that doesn't make a ton of sense, but there are other scenarios where you can add extra columns to the FTS table that aren't indexed. We're not gonna go over that, I don't find that to be terribly compelling. So, we're gonna do video_fts match indexes, and if we pull that, you'll see. Great, this looks pretty good. Introduction to indexes, duplicate atomic. Nope, that says automatic, where to add, partial, primary. We got 23 videos that include indexes. So, this is how you would, from your application side, potentially execute a search query. When a user comes to the page and they say, I wanna see stuff about indexes, you might plop their search term right in there, use an ORM or escape it, don't be silly. But you might plop their search term right in there. And then this could be the exact query that you run. And this might be good enough. We will talk about rank in a future video. Okay, so with that being said, we're gonna change gears just a little bit, and we're gonna not pull back the videos, we're just gonna strictly look at searching. So, we're just gonna select from video_fts, because this makes the queries easier to parse. And now that you know how to bring all the data back, I just wanna show you cool ways to search. So, the first one up is an exact match. And so, anytime you see select from the table where the table, that means all columns. Let's just look over, let's look over everything and title transcript, if there was other stuff there, let's find it. So, we can pull this and say, great, "one of the great things." So, somewhere at the beginning of this one, but somewhere in these two transcripts, I say "one of the great." So, there are at least three great things about SQLite. And you'll see it doesn't appear in the title, in fact, if we were to change this from the entire table to just the title, no luck, that's too bad. But usually you wanna search over all of the text content that you have declared as searchable. So here, we put that inside of double quotes. Similarly, you can use this operator, the plus, to say these things must follow each other. And you can say "one of the great" there. So, that is an exact phrase match. So, if you were to read the docs or articles or anything, you would see phrase, this is a phrase match. So, scrolling down a little bit, you can also have a prefix match. So, we can say, in fact, I want "one of the g," and then anything after that is fine. That would include good, great, green, gray. It would include one of the G, anything. And if we look here, you'll see, "one of the good" and "one of the great." So, this is a nice way to say, hey, I want to include a phrase, but then the next token must start with a G. But I don't care what actually comes after the G. So, include the phrase and then the last token must start with the G. And you'll see how we did this was we put quotes around it and then the asterisk outside of the quotes, because now we're outside of the strict matching. Alright, let's keep going. You can use the caret say, I need this token to appear at the very, very beginning. So, if we run that, you'll see one, one, one, one. So, that is a way that you can control position of the token. So, if you need to ensure that something is at the beginning, you can use that caret. NEAR is pretty cool. This is, when I told you earlier that SQLites, full text search is surprisingly robust. This is some of the stuff that I had in mind. This is very cool to me. So, what we're saying here is, search across all the columns and great and SQLite must be within five tokens of each other. Tokens roughly translates to words. So, we can say "one of the great things about SQLite." Boom, got it. Now, if we were to change this to "one," that likely doesn't return anything because "great" and "SQLite" aren't right next to each other anywhere. So, we can... There, there we go. "Great things about SQLite." So, that's within two tokens of each other, and you can expand it as much as you want. If you don't pass anything, the default tokens are 10. I don't know if that is guaranteed to stay the same in the future. Probably doesn't super matter. So, that is NEAR. We've been searching across the entire, we've been searching across the entire table. You don't have to do that. You can say, where transcript matches strictly "one + of + the + great," and the title matches "SQLite." And that will bring back "SQLite is safe. One of the great things about SQLite." And so, if you want to search across everything, you just throw the table name in there. If you wanna search across column only, you can change it to just the column name. An alternate form of that is this colon syntax right here. You can say "videos_fts MATCH transcript : one + of + the + great," and you should see the exact, yeah, the exact same results come back for both. This is a little bit silly of an example, in my opinion. I wrote it so I'm allowed to say that. But when you see stuff like this, you can go on and create more complex filters. And typically you wouldn't have something like that, you would just do this version here. But once you enter into, let's search across the entire table, you can say transcript must include that, title must include that, let's leave this column out. And so, this is just as an example. If you were to literally write this query, I find this query to be more readable. Readability is always a function of preference and familiarity. So, whatever you find most readable, by all means write that code. Carrying on to the last one down here, you can negate. So, you can say, let's search across video_fts, but not in the title, and let's search for indexes. So, we get 23 rows back. But I don't want to search the title for the word "indexes." Now, what this does mean is, you might get back one that has that in the title. That is exactly what we asked it to do because you'll see, unfortunately, not all indexes are created equal. So, SQLite dutifully executed responsibility by searching the rest of the table, which in our case is just the transcript. It searched the rest of the table for the word "indexes." And then index was, look, it's in several of them. Indexes was in the title. So, if you're trying to exclude rows that contain the word "indexes" in the title, this is not how you do it. This is simply saying, hey, search just the transcript, in this case, search just the transcript. Or rather, don't search the title, but it's okay if it exists in the title. So, there are several more ways that you can search. Of course you can do parentheses with ands and ors. Parentheses with the plus. You can combine column restrictions, you can say, yes, this must be in that column and this must not be in that column, or don't search that column. There are a bunch of different ways to search. This, I find would probably cover most web or mobile applications where a user is initiating a search. Frankly, I find that this would cover most applications where a user is initiating a search. They give you a search query, you put the search query in. That's kind of it. Of course, that is not all use cases, and so you do have the rest of these where you can do all kinds of great stuff. I would again, encourage you to check out the full syntax for the docs if you are building a more robust search engine in SQLite. And if you're building a robust search engine in SQLite, let me know. It would be, I would love to see that, that would be very cool. It's very powerful and very, very doable. In the next video, we're gonna look at ranking these results. So, you get back the rows in the order that makes the most sense to the user.