9.2. Filter¶
Filter focuses on one row at a time, and, if the data in that row satisfies an input condition, puts that row into the result relation, in other words “passes it.”
Filter’s result relation has these notable name and structure characteristics:
Name characteristics:
It has the same base as the input relation.
It has a row modifier that comes from the “condition” input. It has no column modifier.
Structure characteristics:
It has the same identifier as the input relation.
It is exactly as wide as the input relation.
Its height depends on the data. Filter often yields some of the rows, but may yield all or none of them.
9.2.1. Examples¶
With each of the examples for the various operators in this book, we will present some of the relevant data and then an English query and its corresponding precedence chart. We will often see how we translate the query into SQL.
Let’s start with the creature relation that we saw earlier when we introduced the database that we are using.
creatureId |
creatureName |
creatureType |
reside_townId |
idol_creatureId |
---|---|---|---|---|
1 |
Bannon |
person |
p |
10 |
2 |
Myers |
person |
a |
9 |
3 |
Neff |
person |
be |
NULL |
4 |
Neff |
person |
b |
3 |
5 |
Mieska |
person |
d |
10 |
6 |
Carlis |
person |
p |
9 |
7 |
Kermit |
frog |
g |
8 |
8 |
Godzilla |
monster |
t |
6 |
9 |
Thor |
superhero |
as |
NULL |
10 |
Elastigirl |
superhero |
mv |
13 |
11 |
David Beckham |
person |
le |
9 |
12 |
Harry Kane |
person |
le |
11 |
13 |
Megan Rapinoe |
person |
sw |
10 |
Note that creatureId is the identifier of Creature (the column name of the relation is italicized).
9.2.1.1. A simple condition¶
English Query:
Find each person creature.
A more precise wording for this relation would be:
Find each creature whose creatureType is ‘person’.
The corresponding chart for this looks like this:
When the condition is on a column like a type, as in this example, we can often rename the result, so the chart looks like the original English query that we posed:
Get used to making sure the result name is accurate before you rename and shorten it. In this case it was easy and made sense, in other cases it may be more precise and understandable to leave it in the long form.
Important
Notice that the result relation’s base does not change when using the filter operator. In this case, each resulting instance is still a creature, identified by its creatureId.
Now let’s see what the SQL will look like for this. You can look at the second tab to see how the data was created, but you don’t have to. The main focus is the query that matches the chart versions above. An explanation of the SQL code follows this example.
Explanation of the SQL query (first tab):
The keyword SELECT indicates that a set of columns will appear next. In the case of filter, we are retrieving all of the original columns of the input relation. Since this is done fairly often, SQL has a shortcut for this, which is the asterisk symbol, *. The keyword FROM indicates the input relation will come next, which in this case is named creature in this database. The keyword WHERE is followed by the condition that we want to apply. Here we want to return fewer creatures in the result relation, only those that match the conditional phrase found after the where clause. As we have seen all SQL commands end with a semi-colon ( ; ).
9.2.1.2. Another simple condition¶
English Query:
Find each non-person creature.
Corresponding chart:
Corresponding SQL:
The conditions you can use go beyond equality and inequality. For column values that are numbers, the following symbols also serve as relational operators:
greater than: >
greater than or equal to: >=
less than: <
less than or equal to: <=
9.2.1.3. Combining conditions¶
The conditions can be combined in various ways, using 3 logic words that combine them together: and, or, and not.
English Query:
Find each person creature who resides in London.
And more precisely using the creature relation:
Find each creature whose creatureType is ‘person’ and whose reside_townId is ‘le’.
The first version of the query is like how we might say it in English speech with someone, but writing the second re-written version for precision is very helpful for ensuring that you get the precedence chart correct.
Corresponding chart:
Here we show the shortened English description for the result relation.
Corresponding SQL:
Let’s try this on another relation, Achievement:
achId |
creatureId |
skillCode |
proficiency |
achDate |
test_townId |
---|---|---|---|---|---|
1 |
1 |
A |
3 |
2020-07-24 21:37:53 |
a |
2 |
1 |
E |
3 |
2017-09-15 15:35:00 |
d |
3 |
1 |
A |
3 |
2018-07-14 14:00:00 |
a |
4 |
1 |
E |
3 |
2020-07-24 21:37:53 |
d |
5 |
5 |
Z |
6 |
2016-04-12 15:42:30 |
t |
6 |
3 |
Z |
4 |
2018-07-15 00:00:00 |
be |
7 |
3 |
Z |
4 |
2018-07-15 00:00:00 |
be |
8 |
3 |
Z |
4 |
2018-07-15 00:00:00 |
be |
9 |
4 |
Z |
3 |
2018-06-10 00:00:00 |
a |
10 |
11 |
PK |
10 |
1998-08-15 00:00:00 |
le |
11 |
12 |
PK |
10 |
2016-05-24 00:00:00 |
le |
12 |
13 |
PK |
10 |
2012-08-06 00:00:00 |
le |
13 |
8 |
PK |
1 |
NULL |
t |
14 |
9 |
THR |
10 |
2018-08-12 14:30:00 |
mv |
15 |
10 |
THR |
10 |
2018-08-12 14:30:00 |
mv |
16 |
7 |
B2 |
19 |
2017-01-10 16:30:00 |
d |
17 |
9 |
B2 |
19 |
2017-01-10 16:30:00 |
d |
18 |
4 |
TR4 |
85 |
2012-07-30 00:00:00 |
le |
19 |
5 |
TR4 |
85 |
2012-07-30 00:00:00 |
le |
20 |
2 |
TR4 |
85 |
2012-07-30 00:00:00 |
le |
21 |
1 |
TR4 |
85 |
2012-07-30 00:00:00 |
le |
22 |
9 |
D3 |
8 |
2020-07-24 16:37:53 |
sw |
23 |
13 |
D3 |
8 |
2020-07-24 16:37:53 |
sw |
24 |
7 |
D3 |
8 |
2020-07-24 16:37:53 |
sw |
English Query:
Find each achievement whose proficiency is greater than 3 and that was achieved prior to August 1, 2018.
Corresponding chart:
Corresponding SQL:
9.2.2. The Scope and Nature of Filter¶
Important
The scope of filter is one row. The filter condition is applied to each row, one at a time, when deciding if it qualifies for the result relation.
The conditions in the previous query were applied to each row instance, so that only those instances where both conditions applied were kept, because of the boolean connector, AND. Keep this important idea in mind- it will show up in some more sophisticated queries later.
Important
Filter is the only primitive operator that can perform comparisons that are other than exactly matching ones.
This crucial fact has significant consequences. Every other primitive operator in this chapter and the next exactly matches values or does no matching. Later we will consider other sophisticated operators: set operators and match join operators, which only use exact matching (equality comparison) in their conditions. We do not return to non-matching operators until we look at the advanced operator called compare join in a much later chapter.
9.2.3. Date datatype in DBMS¶
The examples in this book that create and manipulate date data values work for SQLite, but will be different for other database systems. You will need to change the SQL examples when trying these in another database. It is good practice to try this, so you are aware of the differences.
For SQLite, one important detail to note is that the month and day portions of the date in singe quotes must begin with a 0 if they are less than 10. The overall format is ‘YYYY-MM-DD HH:MM:SS’.
The great advantage of using precedence charts to plan your query first is that these will look the same regardless of the database: you can use whatever phrase suits what you want to find in the condition in the operator oval. You can get the query correct first before worrying about the SQL implementation details.
9.2.4. A word about phrasing¶
The result relation name, like the input name, is singular in this book. The English queries also have a particular style, which begins “Find each … “. This is purposeful and you should follow this pattern. When using this format, we are making it clear that the result relation contains zero or more instances, each of which meet the conditions we are looking for. The result relation is a set of individual things, one row per thing.
9.2.5. Exercises¶
Try creating the precedence charts for these queries. Use a drawing tool such as diagrams.net. If you haven’t yet, try making a copy of this drawio unary operator template in your browser. To do this, you will need to access this Google Drive file and make a copy in your own Google Drive space. You can use it in a free web-based drawing tool called diagrams.net.
Find each achievement whose proficiency is greater than 3 and not equal to 6.
Find each achievement whose skillCode is ‘Z’.
Find each superhero creature.
Find each skill originating in a town whose townId is ‘d’. (Hint: skill relation has a column called origin_townId, which is a foreign key into the town relation. You do not need town, only skill, to write the chart).
Devise one of your own queries in English and draw the chart.