Refining Your Queries in SQL

Published in

Oretes Academy

3 min readFeb 16, 2021

Relational databases store information in tables, with columns that are analogous to elements in a data structure and rows which are one instance of that data structure. SQL Select retrieves those rows that satisfy the search criteria.

SQL Group By — along with Having and Order By — gives you the power to fine-tune the results, making it easier to deliver actionable data directly, without further massaging before sharing.

Let’s start with perfumeries, a large table of fragrance houses worldwide (the source to make the table appears at the bottom of this post).

Using SQL Where to Narrow Results

We can, for example, get all the houses in a given country by:

SELECT * FROM perfumeries
WHERE country = "Argentina" ;

Getting all of them from a region — say North America — is not much more involved when we use the WHERE IN clause-operator pairing:

SELECT * FROM perfumeries
WHERE country IN ( "Canada", "United States", "Mexico" ) ;

SQL Group By Makes Lists

Let’s pivot, though, and start looking at the data in the aggregate. Perhaps we’re interested in improving import/export flow and we should target our legislative efforts at those countries that have the most fragrance houses. Since we’re not interested in countries that don’t have any fragrance houses, let’s generate a simple list of those which do. SQL Group By gives us the first step in this direction:

SELECT country FROM perfumeries
GROUP BY country ;

This removes all duplicates from the results.

SQL Group By and count(column)

A count of the number of fragrance houses in each country would help target our attention.

SELECT country, COUNT(house) FROM perfumeries
GROUP BY country ;

The syntactic magic contained in COUNT(house) is worth a moment’s pause. SQL will count all the elements specified and return the sum.

Adding SQL Order By Is More Actionable

A good start, the countries in alphabetical order isn’t exactly what was wanted. Using SQL Order By will refine the list further (because asking to group by COUNT(house) is syntactically invalid):

SELECT country, COUNT(house) FROM perfumeries
GROUP BY country
ORDER BY COUNT(house) DESC ;

DESC means “descending order,” ASC is “ascending order.”

Use SQL Having to Extract Exactly

All the examples above show the first lines of a much longer results table. Let’s get at exactly the data we want and remove all human post-processing from our code. This will (1) remove a source of human exhaustion and error and (2) make our code ready for participating in a business logic workflow.

SELECT country, COUNT(house) FROM perfumeries
GROUP BY country
HAVING COUNT(house) >= 5
ORDER BY COUNT(house) DESC ;

Use SQL Where Like for Even Greater Granularity

To end, in the spirit of having our SQL return exactly what’s needed for the task at hand (without any post-processing), a moment taken with pattern-matching wildcards is in order. Consider this refinement:

SELECT country FROM perfumeries
WHERE country LIKE 'K%'
GROUP BY country ;

The WHERE LIKE clause-operator pairing — country LIKE 'K%' — further restricts the results set to those countries which match the pattern “begins with the letter K.” Pattern-matching is a rich subject and I hope to share the power unlocked within in a future blog post.

Conclusion

SQL Group By — coupled with its cousins Having, Order By, Where In, and Where Like — along with related SQL functions — like COUNT(column) — enables you to fine-tune results, making it easier to filter through and deliver actionable data directly to your audience, whether that be human readers or another program waiting for input, without the time-consuming and error-prone human post-processing.

To learn more about SQL, enroll in our Oretes Academy program.