Section 29.4 Filtering
Filtering data is the process of restricting a query to a subset of rows based on some column condition. Maybe we only want to consider trips that start from a particular location. Or rides that have a certain duration.
To filter data with SQL, we use the WHERE
keyword. You can append WHERE column_condition
to any SQL query, and the result will be filtered only to rows that satisfy the column condition. For example, you might want to look only at bike trips that are at least one hour (3600 seconds):
Try running this and examining the duration column… the records that are returned by the SELECT
should now all be ones with a duration of at least 3600.
When writing WHERE
clauses, we use the same kinds of comparison operators that we do in Python: ==
, !=
, >
, >=
, … Just like in Python, we write numbers without commas like 3600
. Also like in Python, if we want to write something that is a string, we have to put quotes around it (either single '
or double "
quotes work). To select the records where member_type
was Member we would write:
...
WHERE
member_type == "Member"
--Use quotes around Member because it is a string
Finally, notice that the WHERE
clause is part of the query as the ;
does not appear until after the WHERE. If there was a ;
at the end of the FROM trip_data
, the statement would end there and the WHERE
would not be considered to be part of the rest of the query. Try putting a ;
after trip_data
and running the program. In the modified program, the SELECT
is still valid and returns the 3 desired columns, but does it for trips of all durations. Then, SQL tries to interpret WHERE duration >= 3600
on its own, which makes no sense (what does it apply to???) and we get an error about the WHERE
.
It is also possible to filter by multiple criteria by using AND
or OR
. For example, to look at bike trips that are 60 minutes or more and the member_type
is MEMBER
, the query would be as below.
Checkpoint 29.4.2.
Write a query to find the ending station and duration (in that order) of all trips by bike number 'W00153'
that lasted over 8 hours (remember that duration is measured in seconds, you will have to figure out how many seconds are in 8 hours).
The bike “number” - ’W00153’ - is not really a number, it is a string, so make sure to use quotes around it in your query.
You have attempted
of
activities on this page.