The exercises for this chapter use a different database than the rest of the chapter. The movies database has two tables, movies and actors. We will be using the movie table (so queries will look like SELECT * from movies). It features the following columns of data:
Table30.6.1.
Column Name
Description
id
A unique number for each record
imdb_id
The id of the movie in the Internet Movie DataBase (IMDB)
title
Title fo the movie
director
Name of the director
year
Year (number) the movie was released
rating
Rating (R, PG, etc…)
genres
Comma-separated list of genres
runtime
Length in minutes
country
Comma separated list of countries it was released in
language
Comma separated list of languages it was released in
imdb_score
Score of the movie (1-10) in IMDB
imdb_votes
Number of ratings for the movie in the IMDB
metacritic_score
Score of the movie (1-100) on the Metacritic website
Checkpoint30.6.2.
Write a query to find the average IMDB score for all the movies.
Checkpoint30.6.3.
Write a query to find the highest IMDB score for any movie that was directed at least partially by Quentin Tarantino. (Hint: A movie can have multiple directors, so you will need to use director LIKE '%Quentin Tarantino%' to identify movies where Quentin Tarantino is in the list of directors.)
Checkpoint30.6.4.
Write a query to display the total number of movies with a rating of 'PG-13' in the database.
Checkpoint30.6.5.
Use GROUP BY to write a query to display each rating category with the number of movies in that category. Your output should look like this:
...
PG | 43
PG-13 | 32
...
Checkpoint30.6.6.
Write a query to list each director with the highest (MAX) imdb_score that director’s movies have earned. Order the output so the directors with the highest imdb_score come first.
Hint: You will need to use both GROUP BY and ORDER BY`.
Your output should look like this:
director | score
Frank Darabont | 9.3
Francis Ford Coppola | 9.2
...
Checkpoint30.6.7.
Write a query to list each director with the COUNT of the number of films they have made. But only retrieve the data for directors with at least 2 films.
The results should be sorted by director’s name (but that should be the default, you should not have to use ORDER BY).
Hint: Remember that you have to use HAVING to filter grouped results.
Your output should look like this:
director | num_films
Akira Kurosawa | 5
Alfred Hitchcock | 6
...
Checkpoint30.6.8.
Write a query to list each director with the number of movies they have made that have an imdb_score of greater than 8.5.
Sort the results so the directors with the greatest number of highly rated files are first.
Hint: Here you want to filter the movies by IMDB rating before they are aggregated, so you need a WHERE not a HAVING .
Your output should look like this:
director | high_rated_films
Peter Jackson | 3
Christopher Nolan | 3
Steven Spielberg | 2
...
The following problems will also make use of the actors table. It consists of records that match a movie to a particular actor. If an actor appears in multiple movies, they will appear in one record for each movie they appear in.
Table30.6.9.
Column Name
Description
id
A unique number for each record
movie_id
The id of the movie the actor appears in. This will match the id field in the movies table.
imdb_id
ID of the actor on the IMDB website
name
Name of the actor
Checkpoint30.6.10.
We would like to know the title of all the movies Marlon Brando acted in.
Write a query that gets just the movies.title from the results of joining the actors table with the movies table (so that actors.movie_id matches with movies.id) and selecting rows in which the actors.name is "Marlon Brando".
Your output should look like this:
title
Apocalypse Now
On the Waterfront
The Godfather
Checkpoint30.6.11.
Write a query that displays each actor.name with the highest metacritic_score of any movie that actor has been in (we won’t know what movie the score is from).
You will have to join actors and movies so that you have access to all the needed data. You will also have to use GROUP BY. The GROUP BY should come after the JOIN.
Your output should look like this:
name | max_metacritic
A.B. Lane | 0
A.J. O'Connor | 0
A.K. Hangal | 84
A.R. Haysel | 87
A.S. Duggal | 67
...