Warning 30.2.1.
All columns that are not in the
GROUP BY clause must be selected using an aggregating function.
COUNT counts the rows. We can also do COUNT(DISTINCT column_name) to get the number of different values that appear in a column.
SUM adds the values of a numeric column
MIN calculates the minimum of a numeric column
MAX calculates the maximum of a numeric column
AVG calculates the mean of a numeric column
MEDIAN calculates the median (middle value) of a numeric column
MODE calculates the mode (most common value) of a numeric column
GROUP BY statement allows you to group data as you aggregate it. The following query counts the number of trips (COUNT(*)) for each member type (GROUP BY member_type).
member_type, and the column named n_trips counts the number of rows for each type. Recall that AS _______ gives a name to a calculated value - any time we do an aggregation, it will be helpful to name the result so we know what we are seeing in the output. By giving the aggregated value a name, we can use it in the ORDER BY to sort by the total number of trips we calculated.
member_type and try to select the start_date. There is no one start date for all of those records. You would instead have to ask about the MIN(start_station) or MODE(start_station) or COUNT(DISTINCT start_station).
GROUP BY clause must be selected using an aggregating function.
start_station, we SELECT start_station. All the other columns we are selecting data from (duration which is selected from twice) have an aggregation function (MAX or MIN in this case).
WHERE before the GROUP BY. For example, you can calculate the number of trips started from each station for just the bike ’W00174’ with this query: